Why you use Dice coefficient as loss value? #3

Closed
tianzq opened this issue Aug 29, 2016 · 27 comments

Comments

@tianzq

tianzq commented Aug 29, 2016

Your VNet works well for segmenting 3D medical images, but the loss value does not look right (plot attached). The solver output is listed below.

I0828 08:49:53.105847 22596 solver.cpp:214] Iteration 9997, loss = 0.938413
I0828 08:49:53.105891 22596 solver.cpp:229] Train net output #0: loss = 0.938413 (* 1 = 0.938413 loss)
I0828 08:49:53.105898 22596 solver.cpp:486] Iteration 9997, lr = 0.0001
I0828 08:49:58.429683 22596 solver.cpp:214] Iteration 9998, loss = 0.929609
I0828 08:49:58.429729 22596 solver.cpp:229] Train net output #0: loss = 0.929609 (* 1 = 0.929609 loss)
I0828 08:49:58.429738 22596 solver.cpp:486] Iteration 9998, lr = 0.0001
I0828 08:50:03.899958 22596 solver.cpp:214] Iteration 9999, loss = 0.935585
I0828 08:50:03.900004 22596 solver.cpp:229] Train net output #0: loss = 0.935585 (* 1 = 0.935585 loss)
I0828 08:50:03.900012 22596 solver.cpp:486] Iteration 9999, lr = 0.0001

figure_1

I tried to apply your pyLayer.py (Dice loss function) to segment 2D images using BVLC's Caffe instead of your 3D Caffe.
I reshaped the score and label in my train_val.prototxt, following the reshape layers in "train_noPooling_ResNet_cinque.prototxt". But it doesn't work; my loss curve is shown below.
Could you tell me what I should do?
figure_2

@faustomilletari
Owner

It is supposed to go up. The maximum is 1.0 when your batch size is 1; otherwise it is 2.0, 3.0, etc., i.e. the maximum equals the batch size.

You can also minimize 1 - Dice.

From your mail I can't tell whether you are actually approaching 1 or not. The first curve looks great; the second one (mean 0.3) is not good.

Regards,

Fausto Milletarì
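
To illustrate the reply above, here is a minimal NumPy sketch of a Dice score that rises toward 1.0 per batch item and is summed over the batch. It is not the code from pyLayer.py: the function name `dice_per_item` and the (batch, n_voxels) binary layout are assumptions made for illustration.

```python
import numpy as np

def dice_per_item(pred, gt):
    """Dice coefficient for each item of a batch of binary masks.

    pred, gt: arrays of shape (batch, n_voxels) with values in {0, 1}.
    (Hypothetical helper, not part of pyLayer.py.)
    """
    pred = pred.astype(np.float64)
    gt = gt.astype(np.float64)
    intersection = (pred * gt).sum(axis=1)
    denom = pred.sum(axis=1) + gt.sum(axis=1)
    return 2.0 * intersection / denom

gt = np.array([[0, 1, 1, 0], [1, 1, 0, 0]])
pred = np.array([[0, 1, 1, 0], [1, 0, 0, 0]])

per_item = dice_per_item(pred, gt)              # one value in [0, 1] per batch item
batch_score = per_item.sum()                    # upper bound equals the batch size
loss_to_minimize = len(per_item) - batch_score  # "1 - Dice", summed over the batch
```

With this convention a rising curve is the desired behaviour; the `1 - Dice` form is only needed if the value should be minimized like a conventional loss.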

@tianzq
Author

tianzq commented Aug 30, 2016

I am trying to use your Dice loss function (pyLayer.py) to segment my 2D image data.
So I used BVLC's Caffe (instead of your 3D Caffe) with my CNN and your Dice loss function. That produced the second curve, which is not good.

BVLC's Caffe with my CNN and a Softmax loss function works well, however.

My question is: do I need to change the source code of BVLC's Caffe to use your Dice loss function (pyLayer.py)?

@faustomilletari
Owner

It is actually weird that it does not work in 2D, but it is possible. I never actually tried it with 2D data, but it should be the same!

There are some reshapes to do before feeding the information to softmax; are you doing those? Do you have a 2-class problem? Is your ground truth binary [0,1]? Where did you get the first curve (the one that looks good) from?

Fausto
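
The question about a binary [0,1] ground truth can be checked programmatically. The sketch below assumes the mask is a NumPy array whose background is encoded as 0 (for example a 0/255 PNG-style mask); `to_binary_labels` is a hypothetical helper, not something from this repository.

```python
import numpy as np

def to_binary_labels(mask):
    """Return a {0, 1} uint8 mask, or raise if more than two values occur.

    Assumes background pixels are stored as 0 (illustrative assumption).
    """
    values = np.unique(mask)
    if values.size > 2:
        raise ValueError("expected a two-class mask, found values %s" % values)
    return (mask != 0).astype(np.uint8)

mask_255 = np.array([[0, 255], [255, 0]], dtype=np.uint8)  # typical 0/255 mask
labels = to_binary_labels(mask_255)
```

A mask stored as 0/255 would pass through a softmax loss with an `ignore`-style setup differently than through a Dice layer that multiplies by the label values, so it is worth ruling out before looking further.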


@tianzq
Author

tianzq commented Aug 30, 2016

I did the reshape according to the file "train_noPooling_ResNet_cinque.prototxt".
My problem is a 2-class segmentation, and the ground truth is binary.

I got the first curve by using your 3D Caffe and VNet on the PROMISE12 training data.
The number of iterations is 10000 (the maximum x-axis value is 10000, not 200000).

@faustomilletari
Owner

OK, good. You can reproduce the results on the PROMISE dataset then. Thank you for your efforts and for having validated ours.

I'm afraid there is something I'm missing here. All your code is on GitHub, right? I can maybe have a look tomorrow. I don't have much free time right now, but I will try, because I'm curious about what could have gone wrong. Maybe there is some problem in 2D, but why would that be the case? There should really be no difference between 2D and 3D.

A student of mine had these problems when she mixed up the ground truth of some patients with the images of others. Even though I'm extremely confident that you have thoroughly checked, could you please make sure that everything is correct at the inputs of the network by visualising your data? Also, you could reshape and plot the result of the forward pass directly from pyLayer.py and maybe get some sense of what is happening.

Regards,
Fausto Milletari
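
The suggestion to reshape and plot the forward pass could look roughly like the sketch below. It assumes the layer sees one image as a flattened (n_classes, height*width) array of softmax probabilities; the helper name and the toy values are illustrative, not taken from pyLayer.py.

```python
import numpy as np

def flat_probs_to_label_image(probs_flat, height, width):
    """Turn flattened class probabilities back into a 2D label image.

    probs_flat: (n_classes, height*width) softmax output for one image.
    (Hypothetical helper for inspection only.)
    """
    n_classes = probs_flat.shape[0]
    probs = probs_flat.reshape(n_classes, height, width)
    return probs.argmax(axis=0)

# Toy 2x2 image: rows are classes, columns are pixels in row-major order.
probs_flat = np.array([[0.9, 0.2, 0.6, 0.1],
                       [0.1, 0.8, 0.4, 0.9]])
label_image = flat_probs_to_label_image(probs_flat, 2, 2)

# Fraction of pixels predicted as class 1; a value near 1.0 means the
# prediction has collapsed to "everything is foreground".
foreground_fraction = label_image.mean()

# With matplotlib available, plt.imshow(label_image) next to the ground
# truth makes mismatched image/label pairs easy to spot.
```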


@tianzq
Author

tianzq commented Aug 30, 2016

I will try it again by following your advice, and will let you know the result.
Thanks so much for your helpful reply.

@faustomilletari
Owner

As a sanity check, can you try with a very small learning rate please?

Fausto Milletarì

@faustomilletari
Owner

Hello,
do you have any updates on this issue?

@tianzq
Author

tianzq commented Sep 7, 2016

Thanks so much for asking. I was on vacation last week.
I will let you know the result as soon as possible.

@tianzq
Author

tianzq commented Sep 8, 2016

I checked the input images and labels by visualising them, and they are correct.
I also tested very small learning rates, which did not help.

I plotted the result of the forward pass using the following command:
(pdb) plt.imshow(solver.net.blobs['score'].data[0].argmax(axis=0))
I found that the Dice loss function tends to assign label 1 (red color) to the whole image.
figure_2

I also plotted the forward pass when using the softmax loss function, shown below, which gives a better result.
figure_1
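
As a side note on the "whole image is label 1" output described above: such a prediction still earns a non-zero Dice score. With N pixels of which G are foreground, predicting foreground everywhere gives 2G / (N + G). The sketch below only works through that arithmetic under the usual Dice definition; it does not explain why training ends up there in this thread, and `dice_all_foreground` is a hypothetical name.

```python
def dice_all_foreground(n_pixels, n_foreground):
    """Dice score of a prediction that marks every pixel as foreground.

    intersection = n_foreground, |pred| = n_pixels, |gt| = n_foreground.
    """
    return 2.0 * n_foreground / (n_pixels + n_foreground)

n_pixels = 256 * 256
for fraction in (0.05, 0.2, 0.5):
    n_fg = int(n_pixels * fraction)
    print(fraction, round(dice_all_foreground(n_pixels, n_fg), 3))
```

For example, a mask with 20% foreground scores 2(0.2) / 1.2, about 0.33, for the all-foreground prediction, whereas an all-background prediction scores 0.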

@faustomilletari
Owner

OK. Let me look into it

Fausto Milletarì

@faustomilletari
Owner

Could you tell me what happens when you train by optimizing over and over on the same image, e.g. with a training set composed of only one image?

Regards,

Fausto Milletarì
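
The single-image test proposed above can be mimicked outside Caffe to check that a Dice objective is optimizable at all. This is a toy sketch under strong assumptions: free per-pixel logits stand in for a network, the Dice form is 2*sum(p*g) / (sum(p) + sum(g)), and the step size is arbitrary. It does not reproduce the Caffe setup from this thread.

```python
import numpy as np

# One fixed ground-truth mask: an 8x8 square inside a 16x16 image.
g = np.zeros((16, 16))
g[4:12, 4:12] = 1.0

z = np.zeros_like(g)   # per-pixel logits, a stand-in for network outputs
step_size = 50.0       # arbitrary illustrative value
history = []

for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-z))          # sigmoid probabilities
    intersection = (p * g).sum()
    denom = p.sum() + g.sum()
    history.append(2.0 * intersection / denom)
    # Derivative of the Dice score with respect to each probability.
    grad_p = 2.0 * (g * denom - intersection) / denom ** 2
    # Gradient ascent, chained through the sigmoid derivative p * (1 - p).
    z += step_size * grad_p * p * (1.0 - p)
```

If `history` climbs toward 1.0 on a single image here but the real network does not, the problem is more likely in the data layout or the layer wiring than in the Dice objective itself.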

@tianzq
Author

tianzq commented Sep 8, 2016

There is no difference between training on one image and training on the whole training data set.
Here the training set is only one 2D prostate image. The label and forward result are also attached.
img
label
foward

@faustomilletari
Owner

Do we have any news on this?

@sagarhukkire
Contributor

@tianzq

Have you tried weighted cross entropy?

Please let me know, because I am facing an issue with it.

Thanks

@tianzq
Author

tianzq commented May 4, 2017 via email

tianzq closed this as completed May 4, 2017

@sagarhukkire
Contributor

@tianzq

Can you please share the solver parameters and prototxt for this, so I can try it? I have been badly stuck for the last 10 days trying to get out of it.

Thanks in advance

@tianzq
Author

tianzq commented May 6, 2017 via email

@sara-eb

sara-eb commented Jun 9, 2017

@tianzq @faustomilletari
Hi, I am doing binary segmentation on 2D (256x256) medical images. I changed the reshape layers of the model as follows:

[...]
layer {
  name: "score_2classes"
  type: "Convolution"
  bottom: "score"
  top: "score_2classes"
  convolution_param {
    num_output: 2
    pad: 0
    kernel_size: 1
  }
}
layer {
  name: "reshapelab"
  type: "Reshape"
  bottom: "label"
  top: "label_flat"
  reshape_param {
    shape {
      dim: 1  # batch size fixed to 1 (dim: 0 would copy the dimension from below)
      dim: 1
      dim: 65536  # 256 * 256 pixels
    }
  }
}
layer {
  name: "reshaperes"
  type: "Reshape"
  bottom: "score_2classes"
  top: "conv_flat"
  reshape_param {
    shape {
      dim: 1  # batch size fixed to 1 (dim: 0 would copy the dimension from below)
      dim: 2
      dim: 65536  # 256 * 256 pixels
    }
  }
}
layer {
  name: "softmax"
  type: "Softmax"
  bottom: "conv_flat"
  top: "softmax_out"
}
layer {
  type: 'Python'
  name: 'loss'
  top: 'loss'
  bottom: 'softmax_out'
  bottom: 'label_flat'

  python_param {
    # the module name -- usually the filename -- that needs to be in $PYTHONPATH
    module: 'pyLayer'
    # the layer name -- the class name in the module
    layer: 'DiceLoss'
  }
  # set loss weight so Caffe knows this is a loss layer.
  # since PythonLayer inherits directly from Layer, this isn't automatically
  # known to Caffe
  loss_weight: 1
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "score_2classes"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}

However, I am getting zero loss all the time. What am I doing wrong? Could you please guide me? Thanks
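
The shape arithmetic behind the reshape layers above can be checked in NumPy before touching Caffe. The sketch assumes a batch of one 256x256 image with two score channels and random placeholder data; it does not diagnose the zero loss, it only confirms that 256 * 256 = 65536 and that a softmax over axis 1 (the default axis of Caffe's Softmax layer) yields per-pixel class probabilities.

```python
import numpy as np

height = width = 256
n_pixels = height * width                                 # 65536

score = np.random.rand(1, 2, height, width)               # N x C x H x W scores
label = np.random.randint(0, 2, (1, 1, height, width))    # N x 1 x H x W labels

score_flat = score.reshape(1, 2, n_pixels)                # like "conv_flat"
label_flat = label.reshape(1, 1, n_pixels)                # like "label_flat"

# Numerically stable softmax over the class axis (axis 1).
shifted = score_flat - score_flat.max(axis=1, keepdims=True)
exp_scores = np.exp(shifted)
probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)
```

Comparing these shapes with the blob shapes reported by the network is a quick way to confirm that the Python layer receives what it expects.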

@happyzhouch

@sara-eb @faustomilletari Hello, I ran into the same problem and get zero loss all the time.
The cause may be the following warnings, but I don't know how to solve it.
/home/data/caffer/python/pyLayer.py:65: RuntimeWarning: invalid value encountered in divide
(self.gt[i, :] * self.union[i]) / ((self.union[i]) ** 2) - 2.0probi,1,: / (
/home/data/caffer/python/pyLayer.py:66: RuntimeWarning: invalid value encountered in divide
(self.union[i]) ** 2))
/home/data/caffer/python/pyLayer.py:68: RuntimeWarning: invalid value encountered in divide
(self.gt[i, :] * self.union[i]) / ((self.union[i]) ** 2) - 2.0probi,1,: / (
/home/data/caffer/python/pyLayer.py:69: RuntimeWarning: invalid value encountered in divide
(self.union[i]) ** 2))
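
NumPy reports "invalid value encountered in divide" for operations such as 0/0. Since the flagged lines divide by `self.union[i] ** 2`, one possible cause (an assumption, not confirmed in this thread) is a union of zero, i.e. an empty prediction together with an empty or all-zero label. A common guard is a small smoothing term; the sketch below shows the idea with a hypothetical `dice_with_eps` helper rather than a patch to pyLayer.py.

```python
import numpy as np

def dice_with_eps(pred, gt, eps=1e-7):
    """Dice score with a small constant that avoids 0/0 on empty masks.

    (Hypothetical helper; eps is an illustrative choice.)
    """
    pred = pred.astype(np.float64)
    gt = gt.astype(np.float64)
    intersection = (pred * gt).sum()
    denom = pred.sum() + gt.sum()
    return (2.0 * intersection + eps) / (denom + eps)

empty = np.zeros(16)

# Without the guard, two empty masks give 0/0, which NumPy evaluates to nan.
with np.errstate(invalid="ignore"):
    unguarded = 2.0 * (empty * empty).sum() / (empty.sum() + empty.sum())

guarded = dice_with_eps(empty, empty)
```

Adding eps to both numerator and denominator makes two empty masks count as perfect agreement, which is a design choice; checking whether the labels really contain any foreground is still worthwhile.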

@amh28

amh28 commented Oct 23, 2017

@tianzq @faustomilletari do you have any updates on what went wrong on @tianzq 's code?

@rishabhsshah

rishabhsshah commented Mar 9, 2018

@faustomilletari @tianzq @amh28 @happyzhouch @sagarhukkire I was facing the same issues that @tianzq mentioned in his comments. I have implemented a Dice loss layer in Python for Caffe that works on 2D images. You can find it at the following link:
Dice loss in Python

@nabsabraham

@Im-Rishabh can you reattach your loss layer? I get an Error 404 when I click your link. Also, I'm not familiar with Caffe; would it be easy to port it to Python for use in Keras?

@rishabhsshah

rishabhsshah commented May 8, 2018

@BrownPanther here is the updated link: Dice loss in Python. Do let me know how it works for you. I am not aware of the implementation details in Keras, so I cannot promise anything. However, as a general comment, it should be easy if you understand the code that I wrote.

@nabsabraham

@Im-Rishabh thanks! I can see it now. This Dice loss can only be used for binary masks, correct? Do you know how to implement it for multiclass without having to use cross entropy as a loss function?

@rishabhsshah

@BrownPanther that's right! This currently works with binary masks. Just to clarify, I am not using cross entropy in my implementation. I have not attempted a multi-class Dice loss; however, you can check this link for some pointers. If you don't mind sharing, what type of data are you working on? I could help you with appropriate loss functions for your use case.

@nabsabraham

nabsabraham commented May 9, 2018

@Im-Rishabh Thanks for that link! I will check it out shortly. I am using the BraTS 2015 dataset (you have to sign up to get access to it). There are 4 input brain images from different imaging modalities and the ground truth mask has 4 classes encoded in an 8-bit image. See attached. The naive way to use dice to train this would be to extract each mask individually and essentially duplicate/quadruple the inputs to match this - is that right? I feel like this would be very computationally inefficient. Any insight would be great! Thanks so much for your help :)
patient2-gt_axial_slice 104
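
On the multi-class question above: the inputs do not need to be duplicated per class. One common formulation, sketched here, one-hot encodes the label mask and computes a Dice term per class in a single pass. It is not the approach from the link mentioned earlier, and it assumes the 8-bit mask has already been mapped to class ids 0..3 and that the network emits per-class softmax probabilities; `multiclass_dice` and `eps` are illustrative names.

```python
import numpy as np

def multiclass_dice(probs, labels, n_classes, eps=1e-7):
    """Per-class soft Dice for one image.

    probs:  (n_classes, H, W) softmax output.
    labels: (H, W) integer class ids in [0, n_classes).
    (Hypothetical helper, shown as a sketch.)
    """
    one_hot = np.eye(n_classes)[labels]        # (H, W, n_classes)
    one_hot = np.moveaxis(one_hot, -1, 0)      # (n_classes, H, W)
    intersection = (probs * one_hot).sum(axis=(1, 2))
    denom = probs.sum(axis=(1, 2)) + one_hot.sum(axis=(1, 2))
    return (2.0 * intersection + eps) / (denom + eps)

labels = np.array([[0, 1], [2, 3]])
perfect = np.moveaxis(np.eye(4)[labels], -1, 0)   # a perfect prediction
per_class = multiclass_dice(perfect, labels, n_classes=4)
loss = 1.0 - per_class.mean()                     # average over classes
```

Averaging over classes weights each class equally regardless of its size; whether to include the background class in that average is a separate choice.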
