puppet_pals t1_j6ygho0 wrote
ImageNet normalization is an artifact of the era of feature engineering. In the modern era you shouldn’t use it. It’s unintuitive and overfits the research dataset.
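For reference, this is the scheme in question. A minimal sketch using the standard ImageNet RGB statistics that most pretrained weights were trained against:

```python
import numpy as np

# Standard per-channel ImageNet RGB statistics baked into many pretrained weights.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def imagenet_normalize(images):
    """images: float array in [0, 1] with shape (..., height, width, 3)."""
    return (images - IMAGENET_MEAN) / IMAGENET_STD
```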
puppet_pals t1_j0p5ai3 wrote
Reply to comment by Ok-Teacher-22 in [R] Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow by Ok-Teacher-22
> stop being lazy
please keep in mind that people on reddit are usually browsing in their free time and might be on mobile.
---
I dug into this for you...
The issue is that complex numbers are cast to the default data type of each individual metric, which is usually float. This is consistent with the behavior of all Keras components: each component has a `compute_dtype` attribute that all inputs and outputs are cast to, which is what allows for mixed precision computation.
Complex numbers are a weird case. They get cast to the metric's native dtype, which is float by default, so the imaginary components are silently dropped. For most dtypes there's a logical translation from one to another, i.e. 1->1.0, 2->2.0, etc. There is no such translation from complex->float.
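Here's a minimal sketch of the silent behavior (assuming TensorFlow 2.x):

```python
import tensorflow as tf

z = tf.constant([1.0 + 2.0j, 3.0 - 4.0j], dtype=tf.complex64)

# Casting complex -> float silently keeps only the real part;
# the imaginary components are dropped with no error or warning.
print(tf.cast(z, tf.float32))  # [1. 3.]

# The explicit alternative that makes the intent obvious:
print(tf.math.real(z))         # [1. 3.]
```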
In my opinion TensorFlow should raise an error when you cast complex->float, but this is not the case in TensorFlow. I have a strong feeling that we can't change this due to backwards compat, but would have to dig deeper to verify this.
In short this is not really a Keras bug but is rather a weird interaction between Keras' mixed precision support and TensorFlow.
I hope this helps! Maybe we can make a push to raise an error when casting from complex->real numbers and force users to call another function explicitly (i.e. tf.real())? I don't know what the "right" solution is here, but that is the history of why this issue exists.
puppet_pals t1_j0kjvqn wrote
Reply to comment by Ok-Teacher-22 in [R] Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow by Ok-Teacher-22
is there a bug report for this? definitely file one if there is not.
I encountered a similar silent failure years ago where the gradient of some operation on a complex number was 0: https://lukewood.xyz/blog/complex-deep-learning
puppet_pals t1_j0jafew wrote
Reply to [R] Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow by Ok-Teacher-22
Awesome work! I work on KerasCV and guarding against silent bugs is my #1 priority in API design. I'll read through this paper, thanks a lot for the hard work in gathering all of these in one place!
puppet_pals t1_izrpa5d wrote
Unfortunately, there are too many variables at play to give you a set-in-stone answer.
puppet_pals t1_iv4jtpc wrote
Reply to comment by sgjoesg in U-Net architecture by Competitive-Good4690
>So model.fit is an inbuilt function for training the model which I don’t want to use. I want to define the model on my own
Just take a UNet implementation that runs with model.fit() and implement your own training loop following this guide: https://lukewood.xyz/blog/keras-model-walkthrough
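Roughly, the custom loop looks like this. Just a sketch: swap in your own UNet for `model` and your real data for `dataset`:

```python
import tensorflow as tf

# Placeholder model and data; replace with your UNet and (image, mask) pipeline.
model = tf.keras.Sequential([tf.keras.layers.Conv2D(1, 3, padding="same")])
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.uniform((8, 64, 64, 3)), tf.random.uniform((8, 64, 64, 1)))
).batch(4)

optimizer = tf.keras.optimizers.Adam(1e-4)
loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)

@tf.function
def train_step(images, masks):
    # One forward/backward pass: this is roughly what model.fit() does per batch.
    with tf.GradientTape() as tape:
        preds = model(images, training=True)
        loss = loss_fn(masks, preds)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

for epoch in range(2):
    for images, masks in dataset:
        loss = train_step(images, masks)
    print("epoch", epoch, "loss", float(loss))
```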
puppet_pals t1_ithgyr7 wrote
Reply to comment by johnnymo1 in [D] Building the Future of TensorFlow by eparlan
Oops - yes.
puppet_pals t1_ithf7m4 wrote
Reply to comment by johnnymo1 in [D] Building the Future of TensorFlow by eparlan
Luke here from KerasCV - the object detection API is still under active development. I recently got the RetinaNet to score an mAP of 0.49 on PascalVOC in a private notebook; that should be possible with just the standalone package in the 0.4.0 release. I'd say give it a month.
The API surface won't change a lot, but the results will get a lot better in the next release.
puppet_pals t1_is5aad2 wrote
Reply to comment by Atom_101 in [D] Are GAN(s) still relevant as a research topic? or is there any idea regarding research on generative modeling? by aozorahime
people are also running with distillation
puppet_pals t1_j701uqt wrote
Reply to comment by netw0rkf10w in [D] ImageNet normalization vs [-1, 1] normalization by netw0rkf10w
>I think normalization will be here to stay (maybe not the ImageNet one though), as it usually speeds up training.
The reality is that you are tied to the normalization scheme of whatever you are transfer learning from (assuming you are transfer learning). Framework authors and people publishing weights should make normalization as easy as possible, typically via a 1/255.0 rescaling operation (or x/127.5 - 1; I'm indifferent, though I opt for 1/255 personally).
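Either convention is a one-liner with the built-in Keras preprocessing layer (a sketch, assuming TF 2.x):

```python
import tensorflow as tf

# Scale pixels from [0, 255] to [0, 1].
rescale_unit = tf.keras.layers.Rescaling(scale=1 / 255.0)

# Or scale to [-1, 1].
rescale_signed = tf.keras.layers.Rescaling(scale=1 / 127.5, offset=-1.0)

images = tf.random.uniform((1, 224, 224, 3), maxval=256)
print(float(tf.reduce_min(rescale_unit(images))), float(tf.reduce_max(rescale_unit(images))))
```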