Unsupervised X-Ray Denoising

Note: Paper accepted at the NIPS workshop on ML4H (Machine Learning for Health).


Example denoising: we trained a DnCNN to denoise an X-ray image without any ground truth data.
Abstract

Among the plethora of techniques devised to curb the prevalence of noise in medical images, deep learning based approaches have shown the most promise. However, one critical limitation of these deep learning based denoisers is their requirement for high-quality, noiseless ground truth images, which are difficult to obtain in many medical imaging applications such as X-ray imaging. To circumvent this issue, we leverage the recently proposed approach of this paper, which incorporates Stein's Unbiased Risk Estimator (SURE) to train a deep convolutional neural network without requiring denoised ground truth X-ray data.

Main idea

We consider recovering the true X-ray image $\boldsymbol{x} \in \mathbb{R}^m$ from noisy measurements of the form \begin{equation} \boldsymbol{y} = \boldsymbol{x} + \boldsymbol{w}, \end{equation} where $\boldsymbol{y} \in \mathbb{R}^m$ is the noise-corrupted image and $\boldsymbol{w} \in \mathbb{R}^m$ denotes independent and identically distributed Gaussian noise, i.e. $\boldsymbol{w} \sim \mathcal{N}(0,\sigma^2 \boldsymbol{I})$, where $\boldsymbol{I}$ is the identity matrix and $\sigma$ is the noise standard deviation, assumed to be known. We are interested in a weakly differentiable function $f_\theta(\cdot)$, parameterized by $\theta$, that maps noisy X-ray images $\boldsymbol{y}$ to clean ones $\boldsymbol{x}$. We model $f_\theta(\cdot)$ by a convolutional neural network (CNN) whose weights are $\theta$. CNN based denoising methods are typically trained by taking a representative set of images $\boldsymbol{x}_1, \boldsymbol{x}_2, \ldots, \boldsymbol{x}_L$ along with a set of observations $\boldsymbol{y}_1, \boldsymbol{y}_2, \ldots, \boldsymbol{y}_L$, obtained either physically or through simulation. The network then learns the mapping $f_\theta : \boldsymbol{y} \rightarrow \boldsymbol{x}$ from observations back to images by minimizing a supervised loss function, typically the mean-squared error (MSE) shown below: \begin{equation} \label{eq:MSE} \underset{\theta}{\text{min}} \quad \frac{1}{L} \sum_{\ell=1}^{L} \Vert \boldsymbol{x}_{\ell} - f_\theta(\boldsymbol{y}_{\ell}) \Vert^2. \end{equation} Note the dependence of the MSE on the ground truth images $\boldsymbol{x}_\ell$. Instead of optimizing the MSE, we employ the SURE loss, whose objective function can be written as \begin{equation} \label{eq:SURE} \frac{1}{L} \sum_{\ell=1}^{L} \left( \frac{1}{m} \Vert \boldsymbol{y}_{\ell} - f_\theta(\boldsymbol{y}_{\ell}) \Vert^2 - \sigma^2 + \frac{2 \sigma^2}{m} \, \text{div}_{\boldsymbol{y}_{\ell}}\big(f_\theta(\boldsymbol{y}_{\ell})\big) \right), \end{equation} which is an unbiased estimate of the MSE that never touches $\boldsymbol{x}_\ell$. Here $\text{div}(\cdot)$ denotes the divergence, defined as \begin{equation} \text{div}_{\boldsymbol{y}}\big(f_\theta(\boldsymbol{y})\big) = \sum_{n=1}^{m} \frac{\partial f_{\theta,n}(\boldsymbol{y})}{\partial y_n}, \end{equation} where $f_{\theta,n}(\boldsymbol{y})$ and $y_n$ are the $n$-th entries of $f_\theta(\boldsymbol{y})$ and $\boldsymbol{y}$, respectively. Calculating the divergence of the denoiser is a central challenge for SURE based estimators; we estimate it with a fast Monte Carlo approximation, see \citep{ramani2008monte} for details. In short, instead of using the supervised MSE loss in \ref{eq:MSE}, we optimize the network weights $\boldsymbol{\theta}$ in an unsupervised manner using \ref{eq:SURE}, which does not require ground truth.
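To make this concrete, below is a minimal PyTorch-style sketch of the SURE objective with the Monte Carlo divergence approximation of \citep{ramani2008monte}, which estimates $\text{div}_{\boldsymbol{y}}(f_\theta(\boldsymbol{y})) \approx \boldsymbol{b}^\top \big(f_\theta(\boldsymbol{y} + \epsilon \boldsymbol{b}) - f_\theta(\boldsymbol{y})\big)/\epsilon$ for a random $\boldsymbol{b} \sim \mathcal{N}(0, \boldsymbol{I})$ and a small $\epsilon$. This is an illustrative sketch rather than our exact implementation; the names `f_theta`, `sigma`, and `eps` are our own.

```python
# Minimal sketch of the MC-SURE loss for Gaussian noise with known std `sigma`,
# assuming a PyTorch denoiser `f_theta` that maps a noisy batch to a denoised batch.
import torch

def mc_sure_loss(f_theta, y, sigma, eps=1e-3):
    """Monte Carlo SURE estimate for a batch of noisy images y of shape (B, C, H, W).

    `sigma` must be on the same intensity scale as `y` (e.g. 25/255 for [0, 1] images).
    """
    m = y[0].numel()                          # number of pixels per image
    fy = f_theta(y)                           # denoised output f_theta(y)

    # Data-fidelity term: per-pixel MSE against the *noisy* input, averaged over the batch.
    fidelity = torch.mean((y - fy) ** 2)

    # Monte Carlo divergence: div ≈ b^T (f(y + eps*b) - f(y)) / eps, averaged over the batch.
    b = torch.randn_like(y)
    fy_perturbed = f_theta(y + eps * b)
    div = torch.sum(b * (fy_perturbed - fy)) / (eps * y.shape[0])

    # SURE = (1/m)||y - f(y)||^2 - sigma^2 + (2 sigma^2 / m) * div
    return fidelity - sigma ** 2 + (2.0 * sigma ** 2 / m) * div
```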

Training and Results

We use the Indiana University Chest X-ray database for training. The database consists of 7,470 chest X-ray images of varying sizes, of which we select 500 for training due to limited computational resources. The training images are re-scaled, cropped, and flipped to form a set of 789,760 overlapping patches, each of size 40 × 40. We experiment with different hyperparameter settings, including the learning rate, number of layers, and patch size, to find suitable values. We train the DnCNN network using the SURE loss, without any clean data, at three noise levels: σ = 10, 25, and 50. The following figure shows the SURE training loss curve for each noise level. Note that the network converges easily for low noise variance, while higher noise levels make convergence harder.

Figure: SURE training loss curves for noise levels σ = 10, 25, and 50.
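As a rough illustration of the training setup described above, a hypothetical training loop could look like the sketch below, reusing `mc_sure_loss` from the previous snippet. The `model` (a DnCNN), `patch_loader`, and hyperparameter values are placeholders rather than our exact settings; note that the clean patch is used only to synthesize the noisy observation and never appears in the loss.

```python
# Hypothetical training-loop sketch: 40x40 patches, synthetic Gaussian noise at level
# sigma, and a DnCNN optimized with the unsupervised SURE loss defined above.
import torch

def train_sure(model, patch_loader, sigma=25.0, epochs=50, lr=1e-3, device="cuda"):
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    sigma_n = sigma / 255.0                        # patches assumed scaled to [0, 1]

    for epoch in range(epochs):
        for x in patch_loader:                     # x: (B, 1, 40, 40) clean patches
            x = x.to(device)
            y = x + sigma_n * torch.randn_like(x)  # noisy observations only
            loss = mc_sure_loss(model, y, sigma_n)  # the loss never sees x
            opt.zero_grad()
            loss.backward()
            opt.step()
        print(f"epoch {epoch}: SURE loss = {loss.item():.6f}")
    return model
```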

Quantitative denoising results for DnCNN trained with the SURE loss for Gaussian noise with standard deviations of 10, 25, and 50. Average PSNR, SSIM, MSE, and per-image denoising time are reported over test images from the Indiana University X-ray dataset and the NIH Chest X-ray dataset.

| Data Set | Noise (σ) | Average PSNR | Average SSIM | Average MSE | Average Time |
|---|---|---|---|---|---|
| Indiana University X-Rays Dataset | 10 | 37.50 | 0.959 | 7.00 | 0.06 |
| Indiana University X-Rays Dataset | 25 | 32.00 | 0.881 | 13.15 | 0.06 |
| Indiana University X-Rays Dataset | 50 | 27.78 | 0.763 | 21.50 | 0.06 |
| NIH Chest X-Rays Dataset | 10 | 37.55 | 0.962 | 13.57 | 0.19 |
| NIH Chest X-Rays Dataset | 25 | 31.85 | 0.877 | 26.14 | 0.19 |
| NIH Chest X-Rays Dataset | 50 | 28.88 | 0.774 | 45.43 | 0.19 |
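For completeness, the averaged metrics reported above can be computed along the lines of the following scikit-image sketch. This is a hedged sketch, assuming test images scaled to [0, 1] and MSE reported on a 0-255 intensity scale; our actual evaluation code may differ in these details.

```python
# Sketch of computing average PSNR, SSIM, and MSE on a held-out test set.
# `model` and `sigma` are assumed to be defined as in the previous snippets;
# `test_images` is assumed to be a list of 2D float arrays in [0, 1].
import numpy as np
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(model, test_images, sigma, device="cuda"):
    psnrs, ssims, mses = [], [], []
    model.eval()
    with torch.no_grad():
        for x in test_images:
            y = x + (sigma / 255.0) * np.random.randn(*x.shape)   # synthetic noisy input
            y_t = torch.from_numpy(y).float()[None, None].to(device)
            x_hat = model(y_t).squeeze().cpu().numpy().clip(0, 1)

            psnrs.append(peak_signal_noise_ratio(x, x_hat, data_range=1.0))
            ssims.append(structural_similarity(x, x_hat, data_range=1.0))
            mses.append(np.mean((x - x_hat) ** 2) * 255.0 ** 2)   # MSE on a 0-255 scale (assumed)
    return np.mean(psnrs), np.mean(ssims), np.mean(mses)
```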


Comparison with other methods

The table above shows the performance of the SURE-based denoiser on two different datasets. The SURE-based model performs comparably well even though it does not require any clean data. Interestingly, the denoiser trained on the Indiana University X-ray dataset also works well on the NIH Chest X-ray dataset. The table below compares DnCNN trained with the unsupervised SURE loss against the same network trained with the supervised MSE loss, for Gaussian noise with standard deviations of 25 and 50; average PSNR, SSIM, and MSE are computed over 10 images from the Indiana University X-ray test set.

| Loss | Noise (σ) | Average PSNR | Average SSIM | Average MSE |
|---|---|---|---|---|
| SURE | 25 | 32.00 | 0.88 | 13.15 |
| SURE | 50 | 27.78 | 0.76 | 21.50 |
| MSE | 25 | 35.39 | 0.95 | 8.96 |
| MSE | 50 | 29.61 | 0.82 | 17.32 |
Acknowledgments
This page's template was taken from this page.