Submitted by SimonJDPrince t3_10jlq1q in MachineLearning

I've been writing a new textbook on deep learning for publication by MIT Press late this year. The current draft is at:

https://udlbook.github.io/udlbook/

It contains a lot more detail than most similar textbooks and will likely be useful for all practitioners, people learning about this subject, and anyone teaching it. It's (supposed to be) fairly easy to read and has hundreds of new visualizations.

Most recently, I've added a section on generative models, including chapters on GANs, VAEs, normalizing flows, and diffusion models.

Looking for feedback from the community.

  • If you are an expert, then what is missing?
  • If you are a beginner, then what did you find hard to understand?
  • If you are teaching this, then what can I add to support your course better?

Plus of course any typos or mistakes. It's kind of hard to proofread your own 500-page book!

310

Comments

arsenyinfo t1_j5m33oc wrote

As a practitioner, I am surprised to see no chapter on finetuning

79

SimonJDPrince OP t1_j5o0orv wrote

Can you give me an example of a review article or chapter in another book that covers roughly what you expect to see?

7

NoRexTreX t1_j5o9hd8 wrote

I can't give example material beyond the Hugging Face documentation, but leveraging pretrained models is the big thing right now, so if your book doesn't mention it then it's missing the hottest topic. Also worth covering: AdapterHub.
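
E.g., a rough sketch with the transformers library (the bert-base-uncased checkpoint and the 2-class head here are just illustrative choices, not a recommendation):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Pull a pretrained checkpoint from the Hugging Face Hub and attach a
# freshly initialized 2-class classification head for the new task.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# The new head is untrained, so these logits are meaningless until fine-tuned.
inputs = tokenizer("This textbook is great!", return_tensors="pt")
logits = model(**inputs).logits
```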

5

new_name_who_dis_ t1_j5oix1c wrote

Fine-tuning isn't really any different from ordinary training…? You just don't start with a randomly initialized network; besides that and the size of the dataset, there isn't anything different about it.
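
As a minimal sketch of the usual recipe in PyTorch (torchvision's ResNet-18, the 10-class head, and the dummy batch are all just placeholder choices):

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from ImageNet weights instead of a random initialization.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Swap the classification head for the new task (here: 10 classes).
model.fc = nn.Linear(model.fc.in_features, 10)

# Optionally freeze the backbone so only the new head is trained.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc")

# From here it is ordinary supervised training, usually with a small
# learning rate. A dummy batch stands in for a real labeled dataset.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
```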

1

SimonJDPrince OP t1_j5olux3 wrote

That was kind of my impression. And I do discuss this in the chapters on transformers and regularization. Was wondering if there is more to it.

3

aristotle137 t1_j5m25xi wrote

Btw, I absolutely loved your computer vision textbook: clear, comprehensible, and so much fun! Best visualizations in the biz. Also loved your UCL course on the subject (I was there 2010/2011). Will definitely check out the next book.

20

Comfortable_End5976 t1_j5m4w76 wrote

Having a skim, it looks good, mate. I like your writing style. Please let us know once it's published and we can pick up a physical copy.

10

Philpax t1_j5lqko9 wrote

Awesome! I'll add it to my reading list :)

7

aamir23 t1_j5mzuw2 wrote

There's another book with the same title: Understanding Deep Learning.

7

Qpylon t1_j5nlguz wrote

That one’s full title is actually “Understanding Deep Learning: Application in Rare Event Prediction”.

4

SimonJDPrince OP t1_j5o0gbn wrote

Yeah -- I feel a bit bad about that, but as someone else pointed out, the title is not actually the same. I should put a link to that book on my website though, so anyone looking for it can find it.

4

taleofbenji t1_j5lw5ie wrote

I love your book and refer to it often. I keep hitting F5 for Chapter 19. :-)

3

K_is_for_Karma t1_j5mh00k wrote

how recent is your chapter on generative models? I’m starting to pursue research in the area and need to get up to date

3

SimonJDPrince OP t1_j5o0jd9 wrote

There are five chapters and around 100 pages. I think it would be a good start.

3

Own_Quality_5321 t1_j5lt463 wrote

Nice. I will have a look and possibly recommend it. Thanks for sharing; that must have been a huge amount of work.

2

promiise t1_j5lu8wl wrote

Nice, thanks for sharing your hard work!

2

PabloEs_ t1_j5mdolq wrote

Looks good and fills a gap; imo there is no good DL book out there. What could be better: state results more clearly, as theorems with all the needed assumptions.

2

profjonathanbriggs t1_j5mljdr wrote

Added to my reading stack. Thanks for this. Will report back with comments.

2

sweetlou357 t1_j5mtytp wrote

Wow this looks like an amazing resource!

2

bacocololo t1_j5obxnl wrote

On page 41, just near Problem 3.9, you wrote "the" twice. Do you need this type of comment too?

2

SimonJDPrince OP t1_j5ocgd0 wrote

Yes! Any tiny errors (even punctuation) are super useful! Couldn't find this though. Can you give me more info about which sentence?

2

bacocololo t1_j5of34s wrote

Just above Section 3.1.2: "...sum of slopes from the the regions".

2

TheMachineTookShape t1_j5ohns9 wrote

There's another on page 349 in section "Combination with other models":

>...will ensure that the the aggregated posterior...

1

SimonJDPrince OP t1_j5olz4s wrote

Thanks! If you send your real name to the e-mail on the front page of the book, then I'll add you to the acknowledgements.

3

TheMachineTookShape t1_j5ofex4 wrote

What is the most efficient way for someone to tell you about typos, or provide suggestions? I'll try to have a read over the weekend.

2

TheMachineTookShape t1_j5ofkzb wrote

Sorry, you've written the instructions right there on the Web page! Just ignore me...

1

LornartheBreton t1_j6i180q wrote

Please let us know when it's published so I can tell my university to buy some copies for its library!

2

like_a_tensor t1_j5n7akg wrote

Very nice work! Do you plan to release any solutions to the problems?

1

SimonJDPrince OP t1_j5o09dv wrote

I will release solutions to about half of them. Have to keep the rest back for professors. You can always message me if you want to know the solution to a particular problem.

2

bacocololo t1_j5ng5fs wrote

Will be pleased to look at it, especially the figures explaining the algorithms. Thanks!

1

bacocololo t1_j5nk0vo wrote

First impression, based on the transformer figures: your book could become a best seller…

1

libai123456 t1_j5nkizf wrote

I really like the book: it provides many beautiful pictures and conveys many of the intuitions behind deep learning algorithms. I really appreciate the work you have put into it.

1

[deleted] t1_j5o68mg wrote

[deleted]

1

SimonJDPrince OP t1_j5ocrdo wrote

I'd say that mine is more internally consistent -- the notation is consistent across all equations and figures. I have made 275 new figures, whereas he has curated existing figures from papers. Mine goes deeper on the topics it covers (only deep learning), but his has much greater breadth. His is more of a reference work, whereas mine is intended mainly for people learning the material for the first time.

Full credit to Kevin Murphy -- writing a book is much more work than people think, so completing that monster is quite an achievement.

Thanks for the tip about Hacker News -- that's a good idea.

3

SatoshiNotMe t1_j5ponoh wrote

Looks like a great book so far. I think it is definitely valuable to focus on giving a clear understanding of selected topics rather than covering everything and compromising depth.

1

bythenumbers10 t1_j5pa9iz wrote

One thing to add: when to reach for deep learning over older, simpler methods. Just an executive summary to keep folks from sandblasting soda crackers, or from being forced to.

1

AdFew4357 t1_j5pqkqb wrote

I have one minor gripe about deep learning textbooks. I think they are great references, but they should not be how beginners get into the field. I genuinely feel a student's time is better spent going down the rabbit hole of actual papers on the topic of maybe one of the chapters; say, a student reads the chapter on graph neural networks and then proceeds to read everything on graph neural networks, rather than reading the whole book across its different subsections.

1

SimonJDPrince OP t1_j5q2nbo wrote

Agreed -- in some cases. It depends on the level of the student, whether they are studying in a class, etc. My goal was to write the first thing you should read about each area.

1

NeoKov t1_j5wmjkr wrote

As a novice, I'm not understanding why the test loss continues to increase -- in general, but also in Fig. 8.2b, if anyone can explain… Does the model continue to update and (over)fit throughout testing? I thought it was static after training. And is the testing batch always the same size as the training batch? And they don't occur simultaneously, right? So the test plot is only generated after the training plot.

1

SimonJDPrince OP t1_j5yc4n2 wrote

You are correct -- they don't usually occur simultaneously. Usually, you would train and then test afterwards, but I've shown the test performance as a function of the number of training iterations, just so you can see what happens with generalization.

(Sometimes people do examine curves like this using validation data, though, so they can see when the best time to stop training is.)

The test loss goes back up because the model classifies some of the test examples incorrectly. With more training iterations, it becomes more certain about its answers (e.g., it pushes the likelihood of its chosen class from 0.9 to 0.99 to 0.999, etc.). For the training data, where everything is classified correctly, that makes the labels more likely and decreases the loss. For the cases in the test data that are classified incorrectly, it makes the true labels less likely, and so the loss starts to go back up.
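
You can see this directly from the cross-entropy loss, which for one example is just -log p, where p is the probability the model assigns to the true class. A quick numerical check:

```python
import math

# Cross-entropy contribution of one example: -log p, where p is the
# probability the model assigns to the TRUE class.
for p in [0.9, 0.99, 0.999]:
    print(f"classified right, p = {p}: loss = {-math.log(p):.4f}")  # shrinks toward 0

for p in [0.1, 0.01, 0.001]:
    print(f"classified wrong, p = {p}: loss = {-math.log(p):.4f}")  # grows without bound
```

So as the model grows more confident, the (shrinking) gains on correctly classified test examples are eventually outweighed by the (growing) losses on the misclassified ones.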

Hope this helps. I will try to clarify in the book. It's always helpful to learn where people are getting confused.

1

NeoKov t1_j5zvrkz wrote

I see, thanks! This seems like a great resource. Thank you for making it available. I’ll post any further questions here, unless GitHub is the preference.

1

SimonJDPrince OP t1_j648ce9 wrote

GitHub or e-mail is better; I'm only occasionally on Reddit.

1

NeoKov t1_j60xeip wrote

Fig. 8.5 mentions a “brown line” for panel b), but the line appears to be black.

1

SimonJDPrince OP t1_j648umm wrote

Thanks! Definitely a mistake. If you send your real name to the e-mail address on the website, I'll add you to the acknowledgements in the book.

Let me know if you find any more.

1

NihonNoRyu t1_j5md892 wrote

Will you add a section for Forward-forward algorithm?

−1

H0lzm1ch3l t1_j5nrium wrote

Didn't Hinton just try to start talking about it again at NeurIPS? Isn't it, like, super irrelevant right now, or am I missing something?

5

SimonJDPrince OP t1_j5o0cj7 wrote

I'm planning to add extra material online for things like this, where it's still unclear how important they are. If they get widely adopted, I'll incorporate them into the next edition.

1

Nhabls t1_j5m4lxa wrote

Obviously I haven't had the time to read through it, and this is a clear nitpick, but I really don't like it when sites like this force you to download files rather than displaying them in the browser by default.

−7