<?xml version="1.0" encoding="utf-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">INFORMATICA</journal-id>
<journal-title-group><journal-title>Informatica</journal-title></journal-title-group>
<issn pub-type="epub">1822-8844</issn>
<issn pub-type="ppub">0868-4952</issn>
<issn-l>0868-4952</issn-l>
<publisher>
<publisher-name>Vilnius University</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">INFOR421</article-id>
<article-id pub-id-type="doi">10.15388/20-INFOR421</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Research Article</subject></subj-group></article-categories>
<title-group>
<article-title>Perceptual Autoencoder for Compressive Sensing Image Reconstruction</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Ralašić</surname><given-names>Ivan</given-names></name><email xlink:href="ivan.ralasic@fer.hr">ivan.ralasic@fer.hr</email><xref ref-type="aff" rid="j_infor421_aff_001"/><xref ref-type="corresp" rid="cor1">∗</xref><bio>
<p><bold>I. Ralašić</bold> received his BSc degree in computing and MSc degree in information and communication technology from the University of Zagreb, in 2014 and 2016, respectively. He is currently pursuing the PhD degree at the University of Zagreb, Faculty of Electrical Engineering and Computing. His current research interests include signal processing, compressive sensing, sparse modelling and machine learning. He is a Student Member of IEEE.</p></bio>
</contrib>
<contrib contrib-type="author">
<name><surname>Seršić</surname><given-names>Damir</given-names></name><xref ref-type="aff" rid="j_infor421_aff_001"/><bio>
<p><bold>D. Seršić</bold> received the diploma degree and the MS and PhD degrees in electrical engineering from the University of Zagreb, Zagreb, Croatia, in 1986, 1993, and 1999, respectively. Since 1987, he has been with the Faculty of Electrical Engineering and Computing, University of Zagreb, where he is currently a full professor. His current research interests include theory and applications of wavelets, advanced signal and image processing, adaptive systems, blind source separation, and compressive sensing. Dr. Seršić is a member of the European Association for Signal Processing. From 2006 to 2008, he served as the chair for the Croatian IEEE Signal Processing Chapter.</p></bio>
</contrib>
<contrib contrib-type="author">
<name><surname>Šegvić</surname><given-names>Siniša</given-names></name><xref ref-type="aff" rid="j_infor421_aff_001"/><bio>
<p><bold>S. Šegvić</bold> received his PhD degree in computer science, in 2004. He spent one year as a post-doctoral researcher at IRISA/INRIA, Rennes, France, and also at TU Graz, Austria. He is currently a full professor at UniZg-FER. His research and professional interests focus on lightweight convolutional architectures for semantic segmentation, detection, re-identification, outlier detection, and semantic forecasting.</p></bio>
</contrib>
<aff id="j_infor421_aff_001"><institution>University of Zagreb</institution>, Faculty of Electrical Engineering and Computing, Unska 3, Zagreb, HR-10000, <country>Croatia</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2020</year></pub-date>
<pub-date pub-type="epub"><day>17</day><month>6</month><year>2020</year></pub-date><volume>31</volume><issue>3</issue><fpage>561</fpage><lpage>578</lpage>
<history>
<date date-type="received"><month>8</month><year>2019</year></date>
<date date-type="accepted"><month>5</month><year>2020</year></date>
</history>
<permissions><copyright-statement>© 2020 Vilnius University</copyright-statement><copyright-year>2020</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>This paper presents a non-iterative deep learning approach to compressive sensing (CS) image reconstruction using a convolutional autoencoder and a residual learning network. An efficient measurement design is proposed in order to enable training of the compressive sensing models on normalized and mean-centred measurements, along with a practical network initialization method based on principal component analysis (PCA). Finally, perceptual residual learning is proposed in order to obtain semantically informative image reconstructions along with high pixel-wise reconstruction accuracy at low measurement rates.</p>
</abstract>
<kwd-group>
<label>Key words</label>
<kwd>compressive sensing</kwd>
<kwd>convolutional autoencoder</kwd>
<kwd>deep learning</kwd>
<kwd>image reconstruction</kwd>
<kwd>perceptual loss</kwd>
<kwd>principal component analysis</kwd>
</kwd-group>
<funding-group>
<award-group>
<funding-source xlink:href="https://doi.org/10.13039/501100004488">Croatian Science Foundation</funding-source>
<award-id>IP-2014-09-2625</award-id>
<award-id>IP-2019-04-6703</award-id>
</award-group>
<award-group>
<funding-source xlink:href="https://doi.org/10.13039/501100008530">European Regional Development Fund</funding-source>
<award-id>KK.01.1.1.01.0009</award-id>
</award-group>
<funding-statement>This work was supported in part by the Croatian Science Foundation under Projects IP-2014-09-2625 and IP-2019-04-6703, and in part by the European Regional Development Fund under Grant KK.01.1.1.01.0009 (DATACROSS). </funding-statement>
</funding-group>
</article-meta>
</front>
<body>
<sec id="j_infor421_s_001">
<label>1</label>
<title>Introduction</title>
<p>Compressive sensing (CS) is a signal processing technique that enables accurate signal recovery from an incomplete set of measurements (Candes and Tao, <xref ref-type="bibr" rid="j_infor421_ref_008">2006</xref>; Baraniuk, <xref ref-type="bibr" rid="j_infor421_ref_003">2007</xref>; Duarte and Eldar, <xref ref-type="bibr" rid="j_infor421_ref_014">2011</xref>; Duarte and Baraniuk, <xref ref-type="bibr" rid="j_infor421_ref_015">2012</xref>): 
<disp-formula id="j_infor421_eq_001">
<label>(1)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="bold">Φ</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">ϵ</mml:mi><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ \boldsymbol{y}=\boldsymbol{\Phi }\boldsymbol{x}+\boldsymbol{\epsilon },\]]]></tex-math></alternatives>
</disp-formula> 
where <bold>Φ</bold> is an <inline-formula id="j_infor421_ineq_001"><alternatives>
<mml:math><mml:mi mathvariant="italic">M</mml:mi><mml:mo>×</mml:mo><mml:mi mathvariant="italic">N</mml:mi></mml:math>
<tex-math><![CDATA[$M\times N$]]></tex-math></alternatives></inline-formula> measurement matrix, <inline-formula id="j_infor421_ineq_002"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo stretchy="false">∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[$\boldsymbol{y}\in {\mathbb{R}^{M}}$]]></tex-math></alternatives></inline-formula> is a set of <italic>M</italic> measurements (where <italic>M</italic> can be much smaller than the original dimensionality of the signal <italic>N</italic>), and <inline-formula id="j_infor421_ineq_003"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">ϵ</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{\epsilon }$]]></tex-math></alternatives></inline-formula> is measurement noise. Efficient signal recovery is possible even when the number of acquired measurements is far below the Shannon-Nyquist limit.</p>
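<p>As a concrete illustration of Eq. (1), the following sketch simulates the CS measurement process; the random Gaussian measurement matrix and the dimensions are illustrative assumptions, not the learned operator proposed in this paper:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

N = 1024   # original signal dimensionality
M = 256    # number of measurements, M << N

# A random Gaussian matrix is a classical choice of CS measurement operator.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x = rng.standard_normal(N)              # signal to be acquired
eps = 0.01 * rng.standard_normal(M)     # measurement noise

y = Phi @ x + eps                       # measurements, Eq. (1)
r = M / N                               # CS measurement rate, here 0.25
```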
<p>The CS reconstruction process can be observed as a linear inverse problem that occurs in numerous image processing tasks such as inpainting (Bertalmio <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_006">2000</xref>; Bugeau <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_007">2010</xref>), super-resolution (Yang <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_042">2010</xref>; Dong <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_010">2016</xref>), and denoising (Elad and Aharon, <xref ref-type="bibr" rid="j_infor421_ref_017">2006</xref>). In order to reconstruct the signal <inline-formula id="j_infor421_ineq_004"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{x}$]]></tex-math></alternatives></inline-formula> from a set of measurements <inline-formula id="j_infor421_ineq_005"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{y}$]]></tex-math></alternatives></inline-formula>, one has to solve the underdetermined (i.e. <inline-formula id="j_infor421_ineq_006"><alternatives>
<mml:math><mml:mi mathvariant="italic">M</mml:mi><mml:mo mathvariant="normal">&lt;</mml:mo><mml:mi mathvariant="italic">N</mml:mi></mml:math>
<tex-math><![CDATA[$M<N$]]></tex-math></alternatives></inline-formula>) system of linear equations in Eq. (<xref rid="j_infor421_eq_001">1</xref>). In the CS literature, the ratio <inline-formula id="j_infor421_ineq_007"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="italic">M</mml:mi><mml:mo mathvariant="normal" stretchy="false">/</mml:mo><mml:mi mathvariant="italic">N</mml:mi></mml:math>
<tex-math><![CDATA[$r=M/N$]]></tex-math></alternatives></inline-formula> is called the CS measurement rate. In order to recover the signal <inline-formula id="j_infor421_ineq_008"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{x}$]]></tex-math></alternatives></inline-formula> from its low-dimensional measurements, it is necessary to use a signal prior that enables the identification of the true solution among an infinite set of feasible solutions. This is usually done by adding a regularization term to the loss function. Typically, the <inline-formula id="j_infor421_ineq_009"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">l</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${l_{0}}$]]></tex-math></alternatives></inline-formula> norm, or its convex relaxation, the <inline-formula id="j_infor421_ineq_010"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">l</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${l_{1}}$]]></tex-math></alternatives></inline-formula> norm, is used as the regularizer under the assumption that the observed signal is sparse in a certain transform domain <bold>Ψ</bold>: 
<disp-formula id="j_infor421_eq_002">
<label>(2)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="bold-italic">s</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="bold">Ψ</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ \boldsymbol{s}=\boldsymbol{\Psi }\boldsymbol{x},\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_infor421_ineq_011"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">s</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{s}$]]></tex-math></alternatives></inline-formula> denotes the sparse representation of the signal <inline-formula id="j_infor421_ineq_012"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{x}$]]></tex-math></alternatives></inline-formula>. Other signal priors can be used as regularizers as well. An unconstrained optimization problem for the sparse signal recovery using <inline-formula id="j_infor421_ineq_013"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">l</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${l_{1}}$]]></tex-math></alternatives></inline-formula> regularization can be written as: 
<disp-formula id="j_infor421_eq_003">
<label>(3)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:munder><mml:mrow><mml:mo movablelimits="false">min</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="bold-italic">s</mml:mi></mml:mrow></mml:munder><mml:mo maxsize="1.19em" minsize="1.19em" stretchy="true">‖</mml:mo><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>−</mml:mo><mml:mi mathvariant="bold">Φ</mml:mi><mml:msup><mml:mrow><mml:mi mathvariant="bold">Ψ</mml:mi></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold-italic">s</mml:mi><mml:msubsup><mml:mrow><mml:mo maxsize="1.19em" minsize="1.19em" stretchy="true">‖</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo stretchy="false">‖</mml:mo><mml:mi mathvariant="bold-italic">s</mml:mi><mml:msub><mml:mrow><mml:mo stretchy="false">‖</mml:mo></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ \underset{\boldsymbol{s}}{\min }\big\| \boldsymbol{y}-\boldsymbol{\Phi }{\boldsymbol{\Psi }^{-1}}\boldsymbol{s}{\big\| _{2}^{2}}+\lambda \| \boldsymbol{s}{\| _{1}}.\]]]></tex-math></alternatives>
</disp-formula> 
Most algorithms for solving sparse optimization problems are iterative and have high computational complexity (Mallat and Zhifeng, <xref ref-type="bibr" rid="j_infor421_ref_027">2006</xref>; Pati <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_033">1993</xref>; Needell and Tropp, <xref ref-type="bibr" rid="j_infor421_ref_031">2009</xref>; Beck and Teboulle, <xref ref-type="bibr" rid="j_infor421_ref_004">2009</xref>; Becker <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_005">2011</xref>; Wright <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_040">2009</xref>). This presents a serious drawback for real-world applications of CS.</p>
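<p>For illustration, a minimal sketch of one such iterative solver, ISTA (Beck and Teboulle, 2009), applied to the problem in Eq. (3) with <bold>Ψ</bold> taken as the identity for brevity (so the operator reduces to <bold>Φ</bold>), could look as follows:</p>

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(y, A, lam, n_iter=500):
    """Minimize ||y - A s||_2^2 + lam * ||s||_1 by iterative shrinkage-thresholding."""
    L = 2.0 * np.linalg.norm(A, 2) ** 2   # Lipschitz constant of the gradient
    s = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ s - y)    # gradient of the data-fidelity term
        s = soft_threshold(s - grad / L, lam / L)
    return s
```

<p>Each iteration costs two matrix-vector products, and hundreds or thousands of iterations are typically required; this per-signal iterative cost is exactly what the non-iterative, feed-forward reconstruction discussed below avoids.</p>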
<p>Having been successfully applied to the image processing tasks mentioned above, machine learning methods have gained increasing interest in the area of CS (Mousavi <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_029">2015</xref>; Mousavi and Baraniuk, <xref ref-type="bibr" rid="j_infor421_ref_028">2017</xref>; Mousavi <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_030">2017</xref>; Kulkarni <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_025">2016</xref>; Hantao <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_018">2019</xref>; Lohit <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_026">2018</xref>). Novel CS reconstruction algorithms based on deep neural networks have recently been proposed; they represent a non-iterative, fast and efficient alternative to traditional CS reconstruction algorithms.</p>
</sec>
<sec id="j_infor421_s_002">
<label>2</label>
<title>Related Work</title>
<p>A deep learning framework based on the stacked denoising autoencoder (SDA) has been proposed in Mousavi <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_029">2015</xref>) and represents pioneering work in learning-based CS reconstruction. The main drawback of the SDA approach is that the network consists of fully-connected layers, which means that all units in two consecutive layers are connected to each other. Thus, as the signal size increases, so does the computational complexity of the neural network. The authors present an extension of their previous work in Mousavi and Baraniuk (<xref ref-type="bibr" rid="j_infor421_ref_028">2017</xref>) and Mousavi <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_030">2017</xref>). The DeepInverse network proposed in Mousavi and Baraniuk (<xref ref-type="bibr" rid="j_infor421_ref_028">2017</xref>) solves the image dimensionality problem by using the adjoint operator <inline-formula id="j_infor421_ineq_014"><alternatives>
<mml:math><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[${\boldsymbol{\Phi }^{T}}$]]></tex-math></alternatives></inline-formula> to initialize the weights of the fully connected reconstruction layer. In Mousavi <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_030">2017</xref>), a non-linear measurement operator is trained to learn a transformation from the original signal space to an undersampled measurement space. A novel class of convolutional neural network (CNN) architectures inspired by the work of Dong <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_010">2016</xref>) was proposed in Kulkarni <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_025">2016</xref>). The proposed CNN takes image block CS measurements as inputs and outputs a block reconstruction obtained from the low-dimensional measurements. An improved ReconNet was proposed in Lohit <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_026">2018</xref>), where the authors use an adversarial loss to further improve the CS reconstruction results. Moreover, the authors add a linear fully connected layer to the existing ReconNet architecture and learn the optimal measurement and reconstruction matrices in a single network. Based on their initial work in Xie <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_041">2017</xref>) and Du <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_013">2019</xref>), the authors propose to train the neural network using perceptual loss in Du <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>). Perceptual loss (Johnson <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_021">2016</xref>) is defined in the latent space of a secondary network and helps to preserve higher-level information compared to the commonly used per-pixel Euclidean loss. 
In Hantao <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_018">2019</xref>), the authors propose a novel <italic>Deep Residual Reconstruction Network</italic> (<inline-formula id="j_infor421_ineq_015"><alternatives>
<mml:math><mml:msup><mml:mrow><mml:mtext>DR</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mtext>-Net</mml:mtext></mml:math>
<tex-math><![CDATA[${\text{DR}^{2}}\text{-Net}$]]></tex-math></alternatives></inline-formula>) to restore the image from its blockwise CS measurements with an additional residual layer that enhances the preliminary image reconstruction.</p>
<p>In this paper, we propose an efficient deep learning model for CS acquisition and reconstruction. Our model is based on a fully convolutional autoencoder with a residual network. The fully convolutional architecture alleviates the signal dimensionality problems that occur in fully-connected network designs (Mousavi <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_029">2015</xref>). A disadvantage of the fully convolutional architecture is that it is not directly applicable to certain imaging modalities where the measurements correspond to the whole signal and one cannot perform measurements in a blockwise manner. In contrast to Mousavi <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_030">2017</xref>), where the authors propose to learn a non-linear measurement operator in their <italic>DeepCodec</italic> network, we use a linear encoding part, while the non-linearities are introduced only into the residual learning network. The motivation for this is to ensure that the learned measurement operator is implementable in real-world CS measurement systems, which are mostly linear. The residual network improves the initial image reconstruction and removes possible reconstruction artifacts.</p>
<p>Although it is well known that normalization of the training data significantly speeds up the training procedure (Ioffe and Szegedy, <xref ref-type="bibr" rid="j_infor421_ref_020">2015</xref>), applying measurement normalization in learning-based CS is not straightforward. In order to normalize and mean-centre the CS measurements, the measurement process has to be redesigned. The mean values of the observed image blocks have to be known in order to perform mean-centring and normalization. Therefore, we dedicate a single measurement vector to measuring the mean value of the observed image block. The rest of the measurement matrix is optimized in the training process. The input to the decoding part of the proposed model consists of mean-centred measurements, and the decoding process yields a mean-centred image reconstruction. The mean value of the observed block is then added to the initial image estimate to obtain the final image reconstruction. Without the proposed modifications of the measurement process, performing reconstruction on normalized measurements would not be feasible. As expected, we show that the measurement normalization process significantly speeds up the convergence of the network.</p>
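<p>The redesigned measurement process described above can be sketched as follows; the block size, the number of measurements and the random content of the trainable rows are illustrative assumptions:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

B = 32              # block size (B x B image blocks)
N = B * B           # vectorized block dimensionality
M = 64              # measurements per block

# One dedicated row measures the block mean; the remaining M - 1 rows
# form the trainable part of the measurement matrix (random here).
phi_mean = np.full((1, N), 1.0 / N)
phi_rest = rng.standard_normal((M - 1, N))
Phi = np.vstack([phi_mean, phi_rest])

x = rng.random(N)               # a vectorized image block
y = Phi @ x                     # raw measurements

block_mean = y[0]               # read off directly from the dedicated row
# Subtracting the known mean contribution yields mean-centred measurements,
# identical to measuring the mean-centred block directly:
y_centred = y[1:] - block_mean * phi_rest.sum(axis=1)
assert np.allclose(y_centred, phi_rest @ (x - block_mean))
```

<p>The decoder then operates on <monospace>y_centred</monospace> (optionally normalized), and <monospace>block_mean</monospace> is added back to the intermediate reconstruction to obtain the final estimate.</p>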
<p>Furthermore, we discuss the connection between the linear autoencoder network and principal component analysis (PCA). Based on our observations, we propose an efficient method for initializing the network weights. The proposed method serves as a bootstrapping step in the network training procedure. Instead of initializing the model with random weights, we propose to use an educated guess for the initial weights obtained by the PCA initialization method.</p>
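<p>A minimal sketch of such a PCA-based initialization, under the assumption that the measurement matrix rows are initialized with the leading principal components of mean-centred training blocks:</p>

```python
import numpy as np

def pca_init(blocks, M):
    """Return an (M, N) encoder initialization and its (N, M) decoder counterpart.

    blocks : (n_samples, N) array of vectorized training image blocks.
    """
    X = blocks - blocks.mean(axis=0, keepdims=True)   # mean-centre the data
    # Right singular vectors of the data matrix are the principal directions.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Phi0 = Vt[:M]          # top-M principal components as measurement vectors
    # The rows are orthonormal, so the transpose inverts the linear mapping
    # in the least-squares sense, giving a natural decoder initialization.
    return Phi0, Phi0.T
```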
<p>Finally, we introduce perceptual loss in the residual network training in order to improve the reconstructions at extremely low measurement rates. Experimental results obtained using the proposed model show improvements in terms of the reconstruction quality.</p>
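<p>Schematically, and following Johnson <italic>et al.</italic> (2016), the perceptual loss compares reconstructions in the latent space of a fixed feature extractor rather than in pixel space; the simple gradient-based feature function below is only a stand-in for a frozen layer of a pretrained network:</p>

```python
import numpy as np

def features(img):
    """Stand-in feature extractor: vertical and horizontal image gradients.
    In practice this would be an early layer of a frozen pretrained CNN."""
    gy, gx = np.gradient(img.astype(float))
    return np.stack([gy, gx])

def perceptual_loss(x_hat, x, feat_fn=features):
    """Euclidean distance in the feature space of the fixed extractor."""
    return float(np.mean((feat_fn(x_hat) - feat_fn(x)) ** 2))
```

<p>In training, this term is typically combined with the per-pixel Euclidean loss rather than replacing it.</p>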
<p>The paper is organized as follows: in Section <xref rid="j_infor421_s_004">3.1</xref>, convolutional autoencoder for CS image reconstruction is proposed. Section <xref rid="j_infor421_s_005">3.2</xref> and Section <xref rid="j_infor421_s_007">3.4</xref> offer a discussion on the measurement matrix optimality and efficient network initialization. Section <xref rid="j_infor421_s_006">3.3</xref> introduces the normalized measurement process. Finally, perceptual residual learning is introduced in Section <xref rid="j_infor421_s_008">3.5</xref> in order to improve the image reconstructions obtained by the autoencoder. Section <xref rid="j_infor421_s_010">4</xref> presents the main results with discussion, while Section <xref rid="j_infor421_s_015">6</xref> offers the conclusion.</p>
</sec>
<sec id="j_infor421_s_003">
<label>3</label>
<title>Proposed Architecture for the CS Model</title>
<sec id="j_infor421_s_004">
<label>3.1</label>
<title>Convolutional Autoencoder</title>
<p>The encoding part of the proposed shallow autoencoder network performs the CS measurement process on an input image, while the decoding part models the CS reconstruction process and reconstructs the input image from the low-dimensional measurement space (Fig. <xref rid="j_infor421_fig_001">1</xref>).</p>
<fig id="j_infor421_fig_001">
<label>Fig. 1</label>
<caption>
<p>Proposed design of the CS image reconstruction model. The convolutional autoencoder learns the end-to-end CS mapping. The encoder performs synthetic measurements on the input image, transforming it into the low-dimensional measurement space. The decoding part learns the optimal inverse mapping from the low-dimensional measurements into the intermediate image reconstruction. The residual network additionally improves the initial image reconstruction.</p>
</caption>
<graphic xlink:href="infor421_g001.jpg"/>
</fig>
<p>In the traditional CS measurement process, an image is vectorized to form a one-dimensional vector <inline-formula id="j_infor421_ineq_016"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="false">∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">N</mml:mi></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[$\boldsymbol{x}\in {\mathbb{R}^{N}}$]]></tex-math></alternatives></inline-formula> and is projected into a low-dimensional measurement vector <inline-formula id="j_infor421_ineq_017"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo stretchy="false">∈</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[$\boldsymbol{y}\in {\mathbb{R}^{M}}$]]></tex-math></alternatives></inline-formula> using an inner product with a collection of measurement vectors <inline-formula id="j_infor421_ineq_018"><alternatives>
<mml:math><mml:msubsup><mml:mrow><mml:mo fence="true" stretchy="false">{</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo fence="true" stretchy="false">}</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msubsup></mml:math>
<tex-math><![CDATA[${\{{\boldsymbol{\phi }_{m}}\}_{m=1}^{M}}$]]></tex-math></alternatives></inline-formula>: 
<disp-formula id="j_infor421_eq_004">
<label>(4)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo fence="true" stretchy="false">⟨</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo fence="true" stretchy="false">⟩</mml:mo><mml:mo>=</mml:mo>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">N</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi mathvariant="italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ {y_{m}}=\langle {\boldsymbol{\phi }_{m}},\boldsymbol{x}\rangle ={\sum \limits_{i=1}^{N}}{\phi _{m,i}}{x_{i}}.\]]]></tex-math></alternatives>
</disp-formula> 
The measurement matrix <bold>Φ</bold> is created by arranging the measurement vectors <inline-formula id="j_infor421_ineq_019"><alternatives>
<mml:math><mml:msubsup><mml:mrow><mml:mi mathvariant="bold-italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msubsup></mml:math>
<tex-math><![CDATA[${\boldsymbol{\phi }_{m}^{T}}$]]></tex-math></alternatives></inline-formula> as rows. Signal dimensionality (i.e. image dimensions) determines the number of columns in the measurement matrix. Consequently, when image dimensions are large, a block-based CS approach is suitable since it operates on local image patches (Du <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_011">2012</xref>). Block-based CS results in lower computational complexity and requires less memory to store the measurement matrix.</p>
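<p>The block-based measurement process can be sketched as follows, assuming the same measurement matrix is shared across all non-overlapping <italic>B</italic> × <italic>B</italic> blocks:</p>

```python
import numpy as np

def blockwise_measure(image, Phi, B):
    """Apply an (M, B*B) measurement matrix to each non-overlapping B x B block."""
    H, W = image.shape
    Y = []
    for i in range(0, H, B):
        for j in range(0, W, B):
            block = image[i:i + B, j:j + B].reshape(-1)  # vectorize the block
            Y.append(Phi @ block)
    return np.array(Y)          # (number of blocks, M) measurements
```

<p>Only an <italic>M</italic> × <italic>B</italic><sup>2</sup> matrix has to be stored regardless of the image size; this is the same computation the decimated convolution of Eq. (5) performs when each filter is a row of <bold>Φ</bold> reshaped to <italic>B</italic> × <italic>B</italic> (up to the vectorization order) and the stride equals <italic>B</italic>.</p>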
<p>In this paper, a linear convolutional layer performs decimated convolution, as in Eq. (<xref rid="j_infor421_eq_005">5</xref>), in order to obtain the measurements. Convolution can be used as an extension of the inner product in which the inner product is computed repeatedly over the image space. 
<disp-formula id="j_infor421_eq_005">
<label>(5)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mtable displaystyle="true" columnspacing="0pt" columnalign="right left"><mml:mtr><mml:mtd/><mml:mtd><mml:mi mathvariant="italic">Y</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="italic">X</mml:mi><mml:mspace width="2.5pt"/><mml:munder><mml:mrow><mml:mo>∗</mml:mo><mml:mo>∗</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">D</mml:mi></mml:mrow></mml:munder><mml:mspace width="2.5pt"/><mml:mo fence="true" stretchy="false">{</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo fence="true" stretchy="false">}</mml:mo><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:msub><mml:mrow><mml:mi mathvariant="italic">Y</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mi mathvariant="italic">i</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">j</mml:mi><mml:mo fence="true" stretchy="false">]</mml:mo><mml:mo>=</mml:mo>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">k</mml:mi></mml:mrow><mml:mrow/></mml:munderover>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">l</mml:mi></mml:mrow><mml:mrow/></mml:munderover><mml:mi mathvariant="italic">X</mml:mi><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mi mathvariant="italic">D</mml:mi><mml:mi mathvariant="italic">i</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="italic">k</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">D</mml:mi><mml:mi mathvariant="italic">j</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="italic">l</mml:mi><mml:mo fence="true" stretchy="false">]</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mi mathvariant="italic">k</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">l</mml:mi><mml:mo fence="true" stretchy="false">]</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ \begin{aligned}{}& Y=X\hspace{2.5pt}\underset{D}{\ast \ast }\hspace{2.5pt}\{{\phi _{m}}\},\\ {} & {Y_{m}}[i,j]={\sum \limits_{k}^{}}{\sum \limits_{l}^{}}X[Di+k,Dj+l]{\phi _{m}}[k,l].\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
In Eq. (<xref rid="j_infor421_eq_005">5</xref>), decimation factor <italic>D</italic> equals the size of the block <italic>B</italic> and the double asterisk (<inline-formula id="j_infor421_ineq_020"><alternatives>
<mml:math><mml:munder><mml:mrow><mml:mo>∗</mml:mo><mml:mo>∗</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">D</mml:mi></mml:mrow></mml:munder></mml:math>
<tex-math><![CDATA[$\underset{D}{\ast \ast }$]]></tex-math></alternatives></inline-formula>) denotes a 2D convolution operator decimated by the same factor. A two-dimensional measurement filter <inline-formula id="j_infor421_ineq_021"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\boldsymbol{\phi }_{m}}$]]></tex-math></alternatives></inline-formula> is created column-wise from the measurement vector <inline-formula id="j_infor421_ineq_022"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\boldsymbol{\phi }_{m}}$]]></tex-math></alternatives></inline-formula> as shown in Fig. <xref rid="j_infor421_fig_002">2</xref>. In Eq. (<xref rid="j_infor421_eq_005">5</xref>), <italic>Y</italic> denotes all the measurements obtained using decimated convolution over the whole input image <italic>X</italic> with the collection of measurement filters <inline-formula id="j_infor421_ineq_023"><alternatives>
<mml:math><mml:mo fence="true" stretchy="false">{</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo fence="true" stretchy="false">}</mml:mo></mml:math>
<tex-math><![CDATA[$\{{\phi _{m}}\}$]]></tex-math></alternatives></inline-formula>. A visualization of the measurement process modelled using 2D convolution is shown in Fig. <xref rid="j_infor421_fig_003">3</xref>.</p>
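<p>The measurement process in Eq. (5) can be sketched in a few lines of NumPy. This is an illustrative toy example, not the authors' implementation; the block size, filter count, and random data are arbitrary. It checks that a strided (decimated) 2D correlation with the filters {ϕ<sub>m</sub>} reproduces the per-block matrix product <bold>y</bold> = <bold>Φx</bold>.</p>

```python
import numpy as np

# Sketch of Eq. (5): a 2D convolution with stride D = B over image X,
# applied with M measurement filters {phi_m} of size B x B.
B, M, N = 4, 3, 8                       # block size, measurements, image size
rng = np.random.default_rng(0)
X = rng.standard_normal((N, N))
phi = rng.standard_normal((M, B, B))    # collection of measurement filters

# Decimated (strided) correlation: one M-dim measurement per B x B block.
Y = np.zeros((N // B, N // B, M))
for i in range(N // B):
    for j in range(N // B):
        block = X[i * B:(i + 1) * B, j * B:(j + 1) * B]
        for m in range(M):
            Y[i, j, m] = np.sum(block * phi[m])

# Equivalent matrix form: Phi holds the flattened filters as its rows,
# so y = Phi @ x for each vectorized block x.
Phi = phi.reshape(M, B * B)
x = X[:B, :B].reshape(-1)
assert np.allclose(Y[0, 0], Phi @ x)
```

The measurement tensor <italic>Y</italic> has shape (N/B) × (N/B) × M, matching the size stated in Fig. 3.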
<p>The CS reconstruction process is modelled using a transposed convolution (Dumoulin and Visin, <xref ref-type="bibr" rid="j_infor421_ref_016">2016</xref>), and the decoding part of the autoencoder is trained to learn the optimal pseudo-inverse linear mapping operator <inline-formula id="j_infor421_ineq_024"><alternatives>
<mml:math><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[${\boldsymbol{\Phi }^{+}}$]]></tex-math></alternatives></inline-formula> from the measurement data.</p>
<fig id="j_infor421_fig_002">
<label>Fig. 2</label>
<caption>
<p>Creating a set of measurement filters from the measurement matrix. Row vector <inline-formula id="j_infor421_ineq_025"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\boldsymbol{\phi }_{m}}$]]></tex-math></alternatives></inline-formula> is reshaped column-wise to create a measurement filter <inline-formula id="j_infor421_ineq_026"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\phi _{m}}$]]></tex-math></alternatives></inline-formula>. The first row vector <inline-formula id="j_infor421_ineq_027"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\boldsymbol{\phi }_{1}}$]]></tex-math></alternatives></inline-formula> of the measurement matrix is kept fixed during the training and corresponds to the measurement vector that calculates the mean value of the observed block. The measurement matrix <bold>Φ</bold> has <inline-formula id="j_infor421_ineq_028"><alternatives>
<mml:math><mml:mi mathvariant="italic">M</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:math>
<tex-math><![CDATA[$M-1$]]></tex-math></alternatives></inline-formula> rows that are optimized. The collection of measurement filters <inline-formula id="j_infor421_ineq_029"><alternatives>
<mml:math><mml:mo fence="true" stretchy="false">{</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo fence="true" stretchy="false">}</mml:mo></mml:math>
<tex-math><![CDATA[$\{{\phi _{m}}\}$]]></tex-math></alternatives></inline-formula> has a depth size of <italic>M</italic> (i.e. <inline-formula id="j_infor421_ineq_030"><alternatives>
<mml:math><mml:mi mathvariant="italic">M</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:math>
<tex-math><![CDATA[$M-1$]]></tex-math></alternatives></inline-formula> trainable filters and one fixed filter).</p>
</caption>
<graphic xlink:href="infor421_g002.jpg"/>
</fig>
<fig id="j_infor421_fig_003">
<label>Fig. 3</label>
<caption>
<p>Visualization of the measurement process using decimated 2D convolution. Block <italic>x</italic> of size <inline-formula id="j_infor421_ineq_031"><alternatives>
<mml:math><mml:mi mathvariant="italic">B</mml:mi><mml:mo>×</mml:mo><mml:mi mathvariant="italic">B</mml:mi></mml:math>
<tex-math><![CDATA[$B\times B$]]></tex-math></alternatives></inline-formula> from the whole image <italic>X</italic> of size <inline-formula id="j_infor421_ineq_032"><alternatives>
<mml:math><mml:mi mathvariant="italic">N</mml:mi><mml:mo>×</mml:mo><mml:mi mathvariant="italic">N</mml:mi></mml:math>
<tex-math><![CDATA[$N\times N$]]></tex-math></alternatives></inline-formula> is convolved with a collection of measurement filters <inline-formula id="j_infor421_ineq_033"><alternatives>
<mml:math><mml:mo fence="true" stretchy="false">{</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo fence="true" stretchy="false">}</mml:mo></mml:math>
<tex-math><![CDATA[$\{{\phi _{m}}\}$]]></tex-math></alternatives></inline-formula> of size <inline-formula id="j_infor421_ineq_034"><alternatives>
<mml:math><mml:mi mathvariant="italic">B</mml:mi><mml:mo>×</mml:mo><mml:mi mathvariant="italic">B</mml:mi><mml:mo>×</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:math>
<tex-math><![CDATA[$B\times B\times M$]]></tex-math></alternatives></inline-formula>. This results in a measurement tensor <inline-formula id="j_infor421_ineq_035"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{y}$]]></tex-math></alternatives></inline-formula> of size <inline-formula id="j_infor421_ineq_036"><alternatives>
<mml:math><mml:mn>1</mml:mn><mml:mo>×</mml:mo><mml:mn>1</mml:mn><mml:mo>×</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:math>
<tex-math><![CDATA[$1\times 1\times M$]]></tex-math></alternatives></inline-formula>. A set of measurement tensors is denoted by <italic>Y</italic> and has a size of <inline-formula id="j_infor421_ineq_037"><alternatives>
<mml:math><mml:mstyle displaystyle="false"><mml:mfrac><mml:mrow><mml:mi mathvariant="italic">N</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">B</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>×</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac><mml:mrow><mml:mi mathvariant="italic">N</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">B</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>×</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:math>
<tex-math><![CDATA[$\frac{N}{B}\times \frac{N}{B}\times M$]]></tex-math></alternatives></inline-formula>.</p>
</caption>
<graphic xlink:href="infor421_g003.jpg"/>
</fig>
</sec>
<sec id="j_infor421_s_005">
<label>3.2</label>
<title>Predefined vs. Adaptive Measurement Matrix</title>
<p>There are two basic approaches for the measurement matrix design. An arbitrary measurement matrix <bold>Φ</bold> can be used in the measurement process to obtain measurements <inline-formula id="j_infor421_ineq_038"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{y}$]]></tex-math></alternatives></inline-formula> from the input images. In traditional CS, a measurement matrix with independent and identically distributed (i.i.d.) Gaussian measurement vectors is often used. In that case, the encoding layer of the autoencoder is initialized using the weights defined by the vectors from the measurement matrix <bold>Φ</bold> and is kept fixed during the training process. Signal dimensionality reduction using a predefined (e.g. random Gaussian, Hadamard, DCT) measurement matrix <bold>Φ</bold> is sub-optimal because it does not exploit the underlying structure of the observed signal.</p>
<p>Alternatively, the optimal measurement matrix can be inferred from the training data. Such a matrix better adapts to the dataset and preserves more information in the measurements, resulting in improved reconstruction quality. In our proposal, we optimize the encoding part of the autoencoder to learn the optimal linear measurement matrix <bold>Φ</bold> from the training dataset. In the experimental section, we show the effect of the measurement matrix choice on the reconstruction results.</p>
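<p>As a minimal sketch of the predefined-matrix approach (all sizes and variable names are illustrative, not taken from the paper), an i.i.d. Gaussian <bold>Φ</bold> compresses each vectorized block independently of the data statistics:</p>

```python
import numpy as np

# Predefined (non-adaptive) measurement: an i.i.d. Gaussian matrix Phi
# maps each vectorized B x B block to M measurements, with M < B^2.
B, M = 8, 16                                          # block size, measurements
rng = np.random.default_rng(0)
Phi = rng.standard_normal((M, B * B)) / np.sqrt(M)    # i.i.d. Gaussian rows
r = M / (B * B)                                       # measurement rate r = 0.25

x = rng.standard_normal(B * B)                        # a vectorized image block
y = Phi @ x                                           # dimensionality 64 -> 16
```

In the adaptive setting described above, the rows of <bold>Φ</bold> would instead be trainable weights of the encoding layer, learned from the training dataset.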
</sec>
<sec id="j_infor421_s_006">
<label>3.3</label>
<title>Network Training Using Normalized Measurements</title>
<p>Training neural networks on normalized, mean-centred data has become standard in all areas of machine learning (Ioffe and Szegedy, <xref ref-type="bibr" rid="j_infor421_ref_020">2015</xref>). It is well known that this practice significantly reduces training time, but its application to learning-based CS is not straightforward: the measurement process needs to be redesigned to obtain normalized, mean-centred measurements, since the mean value of the observed signal has to be measured during the CS acquisition process. In this section, in contrast with previous work in this area, we present an efficient measurement process that enables the direct application of data normalization techniques.</p>
<p>In order to measure the mean value <inline-formula id="j_infor421_ineq_039"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${y_{1}}$]]></tex-math></alternatives></inline-formula> of the observed block (Fig. <xref rid="j_infor421_fig_002">2</xref>), we fix the first row of the measurement matrix <bold>Φ</bold>, so that it corresponds to a row vector containing all ones: 
<disp-formula id="j_infor421_eq_006">
<label>(6)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi mathvariant="italic">B</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi mathvariant="italic">B</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi mathvariant="italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi mathvariant="italic">B</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi mathvariant="italic">B</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi mathvariant="italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ {y_{1}}=\frac{1}{{B^{2}}}{\sum \limits_{i=1}^{{B^{2}}}}{\phi _{1,i}}{x_{i}}=\frac{1}{{B^{2}}}{\sum \limits_{i=1}^{{B^{2}}}}{x_{i}}.\]]]></tex-math></alternatives>
</disp-formula> 
The rest of the matrix (<inline-formula id="j_infor421_ineq_040"><alternatives>
<mml:math><mml:mi mathvariant="italic">M</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:math>
<tex-math><![CDATA[$M-1$]]></tex-math></alternatives></inline-formula> rows) is left to be optimized in the training procedure. We mean-centre the normalized measurements <inline-formula id="j_infor421_ineq_041"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${y_{m}}$]]></tex-math></alternatives></inline-formula> using the obtained mean measurement <inline-formula id="j_infor421_ineq_042"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${y_{1}}$]]></tex-math></alternatives></inline-formula>: 
<disp-formula id="j_infor421_eq_007">
<label>(7)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mo stretchy="false">ˆ</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mo largeop="false" movablelimits="false">∑</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi mathvariant="italic">B</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:msub><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">m</mml:mi></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo mathvariant="normal">,</mml:mo><mml:mspace width="1em"/><mml:mi mathvariant="italic">m</mml:mi><mml:mo stretchy="false">∈</mml:mo><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mn>2</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="italic">M</mml:mi><mml:mo fence="true" stretchy="false">]</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ {\hat{y}_{m}}=\frac{1}{{\textstyle\textstyle\sum _{i=1}^{{B^{2}}}}{\phi _{m,i}}}{y_{m}}-{y_{1}},\hspace{1em}m\in [2,M].\]]]></tex-math></alternatives>
</disp-formula> 
The decoding part of the network is trained using the mean-centred measurement vector <inline-formula id="j_infor421_ineq_043"><alternatives>
<mml:math><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="bold-italic">y</mml:mi></mml:mrow><mml:mo stretchy="false">ˆ</mml:mo></mml:mover></mml:math>
<tex-math><![CDATA[$\hat{\boldsymbol{y}}$]]></tex-math></alternatives></inline-formula> as its input, and it results in the mean-centred image reconstruction. In order to obtain the final image reconstruction, the mean value for each image block is restored by adding <inline-formula id="j_infor421_ineq_044"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${y_{1}}$]]></tex-math></alternatives></inline-formula> to each block.</p>
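<p>The normalization scheme of Eqs. (6)–(7) can be sketched as follows. This is an illustrative NumPy example, not the authors' code, under the assumption that each trainable row of <bold>Φ</bold> has a nonzero weight sum.</p>

```python
import numpy as np

# Eq. (6): the first row of Phi is fixed so that y_1 measures the block
# mean; Eq. (7): the remaining measurements are scaled by their row-weight
# sums and mean-centred by subtracting y_1.
B, M = 4, 5
rng = np.random.default_rng(1)
Phi = rng.standard_normal((M, B * B))
Phi[0] = 1.0 / B**2                      # fixed mean-measuring row, Eq. (6)

x = rng.standard_normal(B * B)           # vectorized image block
y = Phi @ x
y1 = y[0]                                # block-mean measurement
assert np.isclose(y1, x.mean())

# Eq. (7): normalize by the filter-weight sums, then subtract the mean.
row_sums = Phi[1:].sum(axis=1)           # assumed nonzero
y_hat = y[1:] / row_sums - y1            # mean-centred measurements, m in [2, M]
```

At reconstruction time, the decoder operates on <bold>ŷ</bold>, and the stored <italic>y</italic><sub>1</sub> is added back to each reconstructed block, as described above.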
<p>Training the neural network on non-mean-centred data has undesirable consequences. If the data coming into a neuron is always positive (e.g. <inline-formula id="j_infor421_ineq_045"><alternatives>
<mml:math><mml:mi mathvariant="italic">x</mml:mi><mml:mo mathvariant="normal">&gt;</mml:mo><mml:mn>0</mml:mn></mml:math>
<tex-math><![CDATA[$x>0$]]></tex-math></alternatives></inline-formula> element-wise in <inline-formula id="j_infor421_ineq_046"><alternatives>
<mml:math><mml:mi mathvariant="italic">f</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="italic">w</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:mi mathvariant="italic">x</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="italic">b</mml:mi></mml:math>
<tex-math><![CDATA[$f={w^{T}}x+b$]]></tex-math></alternatives></inline-formula>), then the gradient on the weights <italic>w</italic> becomes either all-positive or all-negative (depending on the gradient of the whole expression <italic>f</italic>) during the back-propagation step. In turn, this can introduce undesirable zig-zagging dynamics in the gradient updates of the weights (Karpathy, <xref ref-type="bibr" rid="j_infor421_ref_022">2017</xref>). As shown in Fig. <xref rid="j_infor421_fig_004">4</xref>, zig-zagging also manifests in the loss function. The training loss functions for the unnormalized measurements (red dashed line) and the normalized measurements (blue solid line) are both shown on a <italic>log</italic> scale. Notice that the loss function for the proposed network trained on mean-centred data converges significantly faster than that of the network trained on non-centred measurements.</p>
<fig id="j_infor421_fig_004">
<label>Fig. 4</label>
<caption>
<p>Training loss function. Normalized mean-centred measurements vs. original measurements. Notice the zig-zagging in the loss function when using non-centred measurement data. Loss functions are visualized on the log scale.</p>
</caption>
<graphic xlink:href="infor421_g004.jpg"/>
</fig>
</sec>
<sec id="j_infor421_s_007">
<label>3.4</label>
<title>Efficient Method for Network Initialization</title>
<p>In Lohit <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_026">2018</xref>) and Du <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_013">2019</xref>), the authors optimize the linear encoder in order to infer the optimal measurement matrix for each measurement rate <italic>r</italic>. In Baldi and Hornik (<xref ref-type="bibr" rid="j_infor421_ref_002">1989</xref>), it has been shown that a linear autoencoder with the mean squared error (MSE) loss converges to a unique minimum corresponding to the projection onto the subspace spanned by the first principal components of the data covariance matrix, as obtained by principal component analysis (PCA). Retraining the model for each measurement rate <italic>r</italic> is therefore sub-optimal.</p>
<p>Instead, we propose an efficient initialization method for deep learning CS models based on the observation from Baldi and Hornik (<xref ref-type="bibr" rid="j_infor421_ref_002">1989</xref>). PCA is an analytic method with widespread use in dimensionality reduction. It is performed on the covariance matrix of the data vector <inline-formula id="j_infor421_ineq_047"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{x}$]]></tex-math></alternatives></inline-formula>: 
<disp-formula id="j_infor421_eq_008">
<label>(8)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="italic">C</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="italic">E</mml:mi><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">[</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:msup><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">]</mml:mo><mml:mo>−</mml:mo><mml:mi mathvariant="italic">E</mml:mi><mml:mo fence="true" stretchy="false">[</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo fence="true" stretchy="false">]</mml:mo><mml:mi mathvariant="italic">E</mml:mi><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">[</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">]</mml:mo><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ C(\boldsymbol{x})=E\big[\boldsymbol{x}{\boldsymbol{x}^{T}}\big]-E[\boldsymbol{x}]E\big[{\boldsymbol{x}^{T}}\big],\]]]></tex-math></alternatives>
</disp-formula> 
where <italic>E</italic> denotes the expectation operator. When images are the signals of interest, PCA is performed by calculating an unbiased estimate of the covariance matrix <inline-formula id="j_infor421_ineq_048"><alternatives>
<mml:math><mml:mi mathvariant="italic">C</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math>
<tex-math><![CDATA[$C(\boldsymbol{x})$]]></tex-math></alternatives></inline-formula> for the vectorized images, where <inline-formula id="j_infor421_ineq_049"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{x}$]]></tex-math></alternatives></inline-formula> is a flattened image vector, and <inline-formula id="j_infor421_ineq_050"><alternatives>
<mml:math><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow><mml:mo stretchy="false">¯</mml:mo></mml:mover></mml:math>
<tex-math><![CDATA[$\bar{\boldsymbol{x}}$]]></tex-math></alternatives></inline-formula> is its mean value: 
<disp-formula id="j_infor421_eq_009">
<label>(9)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="italic">C</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">N</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mstyle>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi mathvariant="italic">N</mml:mi></mml:mrow></mml:munderover><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow><mml:mo stretchy="false">¯</mml:mo></mml:mover><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo><mml:msup><mml:mrow><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">n</mml:mi></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mover accent="true"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow><mml:mo stretchy="false">¯</mml:mo></mml:mover><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ C(\boldsymbol{x})=\frac{1}{N-1}{\sum \limits_{n=1}^{N}}({\boldsymbol{x}_{n}}-\bar{\boldsymbol{x}}){({\boldsymbol{x}_{n}}-\bar{\boldsymbol{x}})^{T}}.\]]]></tex-math></alternatives>
</disp-formula> 
After applying the eigendecomposition (Eq. (<xref rid="j_infor421_eq_010">10</xref>)) to the estimate of the covariance matrix <inline-formula id="j_infor421_ineq_051"><alternatives>
<mml:math><mml:mi mathvariant="italic">C</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math>
<tex-math><![CDATA[$C(\boldsymbol{x})$]]></tex-math></alternatives></inline-formula> for the observed images, the eigenvalue matrix <bold>Σ</bold> contains positive eigenvalues <italic>λ</italic> sorted in descending order. Each eigenvalue explains the variance in the direction of the corresponding eigenvector in the orthonormal matrix <inline-formula id="j_infor421_ineq_052"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">U</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{U}$]]></tex-math></alternatives></inline-formula>. Under the assumption that the variance reflects the informational content, a subset of <italic>M</italic> eigenvectors with the largest eigenvalues (i.e. principal components) optimally describes the observed signal in terms of the mean squared error: 
<disp-formula id="j_infor421_eq_010">
<label>(10)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="italic">C</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="bold-italic">U</mml:mi><mml:mi mathvariant="bold">Σ</mml:mi><mml:msup><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">≈</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ C(\boldsymbol{x})=\boldsymbol{U}\boldsymbol{\Sigma }{\boldsymbol{U}^{T}}\approx {\boldsymbol{U}_{1:M}}{\boldsymbol{\Sigma }_{1:M}}{({\boldsymbol{U}_{1:M}})^{T}}.\]]]></tex-math></alternatives>
</disp-formula> 
If the training dataset is formed to faithfully represent the image statistics, the reduced eigenvector matrix <inline-formula id="j_infor421_ineq_053"><alternatives>
<mml:math><mml:msubsup><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msubsup></mml:math>
<tex-math><![CDATA[${\boldsymbol{U}_{1:M}^{T}}$]]></tex-math></alternatives></inline-formula> optimally preserves the informational content of the observed image blocks.</p>
<p>Thus, we propose to use the reduced eigenvector matrix <inline-formula id="j_infor421_ineq_054"><alternatives>
<mml:math><mml:msubsup><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msubsup></mml:math>
<tex-math><![CDATA[${\boldsymbol{U}_{1:M}^{T}}$]]></tex-math></alternatives></inline-formula> to initialize the weights of the encoding part of the CS model: 
<disp-formula id="j_infor421_eq_011">
<label>(11)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="bold">Φ</mml:mi><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msubsup><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ \boldsymbol{\Phi }={\boldsymbol{U}_{1:M}^{T}}.\]]]></tex-math></alternatives>
</disp-formula>
</p>
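<p>The proposed PCA initialization (Eqs. (9)–(11)) can be sketched as follows. The training blocks here are random stand-ins for real image data, and the variable names are illustrative.</p>

```python
import numpy as np

# Eq. (9): unbiased covariance estimate from vectorized training blocks;
# Eq. (10): eigendecomposition C = U Sigma U^T with eigenvalues sorted
# in descending order; Eq. (11): initialize Phi with the M leading
# eigenvectors, Phi = U_{1:M}^T.
B, M, n_train = 4, 6, 1000
rng = np.random.default_rng(2)
Xtr = rng.standard_normal((n_train, B * B))   # stand-in vectorized blocks

C = np.cov(Xtr, rowvar=False)                 # unbiased estimate, Eq. (9)
eigvals, U = np.linalg.eigh(C)                # eigh returns ascending order
order = np.argsort(eigvals)[::-1]             # re-sort to descending
U = U[:, order]

Phi = U[:, :M].T                              # Eq. (11): Phi = U_{1:M}^T
assert Phi.shape == (M, B * B)
# The rows of Phi are orthonormal, so Phi Phi^T = I and the
# pseudo-inverse used for decoder initialization reduces to Phi^T.
assert np.allclose(Phi @ Phi.T, np.eye(M))
```

The orthonormality check above is exactly the property exploited when initializing the reconstruction part of the network, since it makes the pseudo-inverse of <bold>Φ</bold> trivial to compute.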
<p>Furthermore, we propose to initialize the reconstruction part of the network using the PCA as well. The eigenvector matrix <inline-formula id="j_infor421_ineq_055"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">U</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{U}$]]></tex-math></alternatives></inline-formula> is a unitary matrix. If the measurement matrix <bold>Φ</bold> is equal to the reduced eigenvector matrix <inline-formula id="j_infor421_ineq_056"><alternatives>
<mml:math><mml:msubsup><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msubsup></mml:math>
<tex-math><![CDATA[${\boldsymbol{U}_{1:M}^{T}}$]]></tex-math></alternatives></inline-formula> as in our proposal, we can write: 
<disp-formula id="j_infor421_eq_012">
<label>(12)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="bold">Φ</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msubsup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ \boldsymbol{y}=\boldsymbol{\Phi }\boldsymbol{x}={\boldsymbol{U}_{1:M}^{T}}\boldsymbol{x}.\]]]></tex-math></alternatives>
</disp-formula> 
The original image <inline-formula id="j_infor421_ineq_057"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{x}$]]></tex-math></alternatives></inline-formula> can be reconstructed using the pseudo-inverse of the measurement matrix: 
<disp-formula id="j_infor421_eq_013">
<label>(13)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt"><mml:mtr><mml:mtd class="align-odd"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mtd><mml:mtd class="align-even"><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mi mathvariant="bold-italic">y</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo><mml:mi mathvariant="bold">Φ</mml:mi><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold-italic">y</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">[</mml:mo><mml:msup><mml:mrow><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msub><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">]</mml:mo></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold-italic">y</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd class="align-odd"/><mml:mtd class="align-even"><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>:</mml:mo><mml:mi mathvariant="italic">M</mml:mi></mml:mrow></mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[\begin{aligned}{}\boldsymbol{x}& ={\boldsymbol{\Phi }^{+}}\boldsymbol{y}\\ {} & ={\boldsymbol{\Phi }^{T}}{\big(\boldsymbol{\Phi }{\boldsymbol{\Phi }^{T}}\big)^{-1}}\boldsymbol{y}\\ {} & ={\boldsymbol{U}_{1:M}}{\big[{({\boldsymbol{U}_{1:M}})^{T}}{\boldsymbol{U}_{1:M}}\big]^{-1}}\boldsymbol{y}\\ {} & ={\boldsymbol{U}_{1:M}}\boldsymbol{y}.\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
Since <inline-formula id="j_infor421_ineq_058"><alternatives>
<mml:math><mml:msup><mml:mrow><mml:mi mathvariant="bold-italic">U</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">T</mml:mi></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[${\boldsymbol{U}^{T}}$]]></tex-math></alternatives></inline-formula> is a unitary matrix, the pseudo-inverse matrix <inline-formula id="j_infor421_ineq_059"><alternatives>
<mml:math><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[${\boldsymbol{\Phi }^{+}}$]]></tex-math></alternatives></inline-formula> used for the CS reconstruction reduces to the transpose of the measurement matrix. This yields an efficient method for initializing the neural network weights in both the encoding and decoding parts of learning-based CS models.</p>
<p>The proposed initialization method for the network weights has several advantages. While a neural network has to be retrained in order to obtain the measurement matrix <bold>Φ</bold> for a different sub-rate <italic>r</italic>, the PCA approach outputs the whole eigenvector matrix <inline-formula id="j_infor421_ineq_060"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">U</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{U}$]]></tex-math></alternatives></inline-formula>. Thus, for any measurement rate, the initial measurement matrix <bold>Φ</bold> can be formed by selecting the <italic>M</italic> eigenvectors corresponding to the largest eigenvalues and using them to initialize the model. The learning-based approach is significantly slower, since learning the optimal measurement operator is extremely hard and the network might not fully converge. In contrast, in the case of a linear autoencoder, the PCA yields the exact solution for the optimal measurement and reconstruction operators in a fraction of the time needed to train the neural network. The PCA initialization of the autoencoder can be beneficial even when the training loss is not the pixel-wise Euclidean distance and when additional regularization is introduced in the training procedure.</p>
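The PCA-based initialization described above can be sketched in a few lines of numpy. The random patch data, the block length and the sub-rate below are illustrative assumptions, not the paper's actual training set; the sketch only demonstrates Eqs. (11)–(13) and the fact that the pseudo-inverse collapses to the transpose.

```python
import numpy as np

# Hypothetical training data: rows are vectorized image blocks
# (a toy block length N = 64 stands in for 32x32 -> N = 1024).
rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 64))

# PCA: eigenvectors of the data covariance, sorted by decreasing eigenvalue.
eigvals, U = np.linalg.eigh(np.cov(X, rowvar=False))  # ascending order
U = U[:, ::-1]                                        # largest eigenvalues first

M = 16                      # sub-rate r = M/N = 0.25
Phi = U[:, :M].T            # measurement matrix, Eq. (11)

# Columns of U_{1:M} are orthonormal, so Phi^+ = Phi^T = U_{1:M}.
assert np.allclose(np.linalg.pinv(Phi), U[:, :M], atol=1e-8)

# Measurement and linear reconstruction of one block:
x = X[0]
y = Phi @ x                 # y = Phi x, Eq. (12)
x_hat = U[:, :M] @ y        # x ~= U_{1:M} y, Eq. (13)
```

This is exactly why the initialization is cheap: no training is needed to obtain either the encoder weights `Phi` or the decoder weights `Phi.T`.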
</sec>
<sec id="j_infor421_s_008">
<label>3.5</label>
<title>Residual Network</title>
<p>As previously mentioned, the first part of the proposed network consists of a linear autoencoder. Non-linearities can easily be introduced into the measurement and reconstruction parts of the network to further improve the initial reconstruction obtained by the autoencoder. In our proposal, non-linearities are only introduced into the decoding part of the network. Although there are methods that learn a non-linear measurement operator from the data (Mousavi <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_030">2017</xref>), linearity is an important property of measurement systems, and we want our CS model to be realizable in real physical measurement setups such as those of Takhar <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_038">2006</xref>) and Ralašić <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_035">2018</xref>).</p>
<fig id="j_infor421_fig_005">
<label>Fig. 5</label>
<caption>
<p>Contrast-adjusted visualization of the learned residual for several test images and for the measurement ratio <inline-formula id="j_infor421_ineq_061"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.25</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.25$]]></tex-math></alternatives></inline-formula>: 1) <italic>Barbara</italic>, 2) <italic>Parrot</italic>, 3) <italic>Peppers</italic>. Notice that the residual network improves the preliminary reconstructions in terms of blocking artifact removal, high-frequency content restoration and edge preservation.</p>
</caption>
<graphic xlink:href="infor421_g005.jpg"/>
</fig>
<p>The output of the proposed convolutional autoencoder represents a preliminary reconstruction of the input image from its low-dimensional measurements. We feed the preliminary reconstruction to a residual network (He <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_019">2015</xref>) that introduces non-linearity, reduces reconstruction and blocking artifacts, and eliminates the need for an off-the-shelf denoiser such as BM3D (Dabov <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_009">2009</xref>) used by the competing methods. Figure <xref rid="j_infor421_fig_005">5</xref> shows several examples of the estimated residual. Residual learning compensates for some of the high-frequency loss and improves the initial image reconstruction.</p>
<p>Figure <xref rid="j_infor421_fig_006">6</xref> shows the architecture of the residual learning block used in our proposal. The residual network consists of two residual learning blocks, each of which has three layers. The first layer consists of 16 convolutional filters of size <inline-formula id="j_infor421_ineq_062"><alternatives>
<mml:math><mml:mn>3</mml:mn><mml:mo>×</mml:mo><mml:mn>3</mml:mn></mml:math>
<tex-math><![CDATA[$3\times 3$]]></tex-math></alternatives></inline-formula> with stride 1, followed by a ReLU non-linearity. The second layer has <inline-formula id="j_infor421_ineq_063"><alternatives>
<mml:math><mml:mn>3</mml:mn><mml:mo>×</mml:mo><mml:mn>3</mml:mn><mml:mo>×</mml:mo><mml:mn>32</mml:mn></mml:math>
<tex-math><![CDATA[$3\times 3\times 32$]]></tex-math></alternatives></inline-formula> filters with stride 1, also followed by a ReLU non-linearity. The final layer consists of a single filter of size <inline-formula id="j_infor421_ineq_064"><alternatives>
<mml:math><mml:mn>3</mml:mn><mml:mo>×</mml:mo><mml:mn>3</mml:mn></mml:math>
<tex-math><![CDATA[$3\times 3$]]></tex-math></alternatives></inline-formula>, which outputs the inferred residual image. Image dimensions are preserved in each layer by the appropriate zero-padding. Identity shortcuts are added to each residual block and are used to propagate the intermediate image reconstructions.</p>
</sec>
<sec id="j_infor421_s_009">
<label>3.6</label>
<title>Choice of the Loss Function</title>
<p>Reconstructing the high-frequency content of the original image (i.e. edges, texture) is problematic for the linear autoencoder, and the residual network helps to alleviate this problem. The problem arises partly because the lower-frequency content is dominant in natural images, so the learned measurement filters have a low-pass character, and partly due to the choice of the loss function used for training the network. It is well known that the MSE loss function yields blurry images (Kristiadi, <xref ref-type="bibr" rid="j_infor421_ref_024">2019</xref>). Thus, some papers suggest using a different loss function for network training. For example, Lohit <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_026">2018</xref>) use an adversarial loss in addition to the Euclidean loss to obtain better and sharper reconstructions. Furthermore, Du <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>) use a perceptual loss in order to achieve better reconstruction results. The authors train their model using the Euclidean loss in the latent space of the <inline-formula id="j_infor421_ineq_065"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">VGG</mml:mtext></mml:mrow><mml:mrow><mml:mn>19</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{VGG}_{19}}$]]></tex-math></alternatives></inline-formula> neural network (Simonyan and Zisserman, <xref ref-type="bibr" rid="j_infor421_ref_036">2014</xref>).</p>
<p>In this paper, we fuse the per-pixel reconstruction loss in the autoencoder with the perceptual loss in the latent space in the residual network. This is in contrast with Du <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>), where the authors optimize the whole network using the Euclidean loss in the latent space. As a consequence, their method produces semantically informative reconstructions, but with low per-pixel accuracy. By combining the Euclidean and perceptual losses, we obtain semantically informative reconstructions that also achieve high per-pixel accuracy, resulting in a higher PSNR compared to Du <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>).</p>
<fig id="j_infor421_fig_006">
<label>Fig. 6</label>
<caption>
<p>Residual learning block. The residual learning block consists of 3 convolutional layers.</p>
</caption>
<graphic xlink:href="infor421_g006.jpg"/>
</fig>
<p>The pixel-wise Euclidean loss function for the autoencoder is defined as: 
<disp-formula id="j_infor421_eq_014">
<label>(14)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">{</mml:mo><mml:mi mathvariant="bold">Φ</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">}</mml:mo><mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo><mml:mo>=</mml:mo><mml:mo maxsize="1.19em" minsize="1.19em" stretchy="true">‖</mml:mo><mml:mi mathvariant="italic">x</mml:mi><mml:mo>−</mml:mo><mml:mi mathvariant="italic">f</mml:mi><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">{</mml:mo><mml:mi mathvariant="italic">x</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">{</mml:mo><mml:mi mathvariant="bold">Φ</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">}</mml:mo><mml:mo fence="true" maxsize="1.19em" minsize="1.19em">}</mml:mo><mml:msubsup><mml:mrow><mml:mo maxsize="1.19em" minsize="1.19em" stretchy="true">‖</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo mathvariant="normal">,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ {\mathcal{L}_{1}}\big(\big\{\boldsymbol{\Phi },{\boldsymbol{\Phi }^{+}}\big\}\big)=\big\| x-f\big\{x,\big\{\boldsymbol{\Phi },{\boldsymbol{\Phi }^{+}}\big\}\big\}{\big\| _{2}^{2}},\]]]></tex-math></alternatives>
</disp-formula> 
where <bold>Φ</bold> denotes the weights of the measurement operator, <inline-formula id="j_infor421_ineq_066"><alternatives>
<mml:math><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[${\boldsymbol{\Phi }^{+}}$]]></tex-math></alternatives></inline-formula> are the weights of the reconstruction operator, <italic>x</italic> is the original image and <inline-formula id="j_infor421_ineq_067"><alternatives>
<mml:math><mml:mi mathvariant="italic">f</mml:mi><mml:mo fence="true" stretchy="false">{</mml:mo><mml:mi mathvariant="italic">x</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mo fence="true" stretchy="false">{</mml:mo><mml:mi mathvariant="bold">Φ</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:msup><mml:mrow><mml:mi mathvariant="bold">Φ</mml:mi></mml:mrow><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo fence="true" stretchy="false">}</mml:mo><mml:mo fence="true" stretchy="false">}</mml:mo></mml:math>
<tex-math><![CDATA[$f\{x,\{\boldsymbol{\Phi },{\boldsymbol{\Phi }^{+}}\}\}$]]></tex-math></alternatives></inline-formula> is the image reconstruction obtained by the autoencoder.</p>
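As a minimal numpy sketch, the loss in Eq. (14) is the ordinary squared Euclidean distance between a block and its reconstruction; the function and argument names are illustrative.

```python
import numpy as np

def euclidean_loss(x, x_rec):
    # Eq. (14): squared l2 norm of the per-pixel reconstruction error
    # between the original block x and the autoencoder output
    # f{x, {Phi, Phi+}} (here passed in as x_rec).
    return np.sum((np.asarray(x) - np.asarray(x_rec)) ** 2)
```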
<p>The residual part of the proposed network is trained separately from the autoencoder part, using the perceptual loss function <inline-formula id="j_infor421_ineq_068"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathcal{L}_{2}}$]]></tex-math></alternatives></inline-formula> (Eq. (<xref rid="j_infor421_eq_015">15</xref>)) in the latent space of the <inline-formula id="j_infor421_ineq_069"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">VGG</mml:mtext></mml:mrow><mml:mrow><mml:mn>19</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{VGG}_{19}}$]]></tex-math></alternatives></inline-formula> network, similarly to Du <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>). In contrast with Du <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>), we use a linear combination of Euclidean losses defined on the features of the second and third max-pooling layers of the <inline-formula id="j_infor421_ineq_070"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">VGG</mml:mtext></mml:mrow><mml:mrow><mml:mn>19</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{VGG}_{19}}$]]></tex-math></alternatives></inline-formula> network, instead of the Euclidean loss on an individual feature map. The motivation for this is to simultaneously reconstruct both the low-level information contained in the bottom layers and the high-level semantic features contained in the top layers of the <inline-formula id="j_infor421_ineq_071"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">VGG</mml:mtext></mml:mrow><mml:mrow><mml:mn>19</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{VGG}_{19}}$]]></tex-math></alternatives></inline-formula> network. 
<disp-formula id="j_infor421_eq_015">
<label>(15)</label><alternatives>
<mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo><mml:mo fence="true" stretchy="false">{</mml:mo><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mo fence="true" stretchy="false">}</mml:mo><mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mstyle>
<mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mstyle displaystyle="true"><mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:munderover><mml:mo maxsize="1.19em" minsize="1.19em" stretchy="true">‖</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo><mml:mi mathvariant="italic">x</mml:mi><mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo><mml:mo>−</mml:mo><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo><mml:mi mathvariant="italic">f</mml:mi><mml:mo fence="true" stretchy="false">{</mml:mo><mml:mi mathvariant="italic">x</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mo fence="true" stretchy="false">}</mml:mo><mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo><mml:msubsup><mml:mrow><mml:mo maxsize="1.19em" minsize="1.19em" stretchy="true">‖</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math>
<tex-math><![CDATA[\[ {\mathcal{L}_{2}}\big(\{\boldsymbol{W}\}\big)=\frac{1}{2}{\sum \limits_{j=2}^{3}}\big\| {\phi _{j}}(x)-{\phi _{j}}\big(f\{x,\boldsymbol{W}\}\big){\big\| _{2}^{2}}.\]]]></tex-math></alternatives>
</disp-formula> 
In Eq. (<xref rid="j_infor421_eq_015">15</xref>), <inline-formula id="j_infor421_ineq_072"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">j</mml:mi></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\phi _{j}}$]]></tex-math></alternatives></inline-formula> denotes the feature map of the <italic>j</italic>-th layer of the <inline-formula id="j_infor421_ineq_073"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">VGG</mml:mtext></mml:mrow><mml:mrow><mml:mn>19</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{VGG}_{19}}$]]></tex-math></alternatives></inline-formula> with input <italic>x</italic>. Furthermore, <inline-formula id="j_infor421_ineq_074"><alternatives>
<mml:math><mml:mi mathvariant="bold-italic">W</mml:mi></mml:math>
<tex-math><![CDATA[$\boldsymbol{W}$]]></tex-math></alternatives></inline-formula> denotes filter weights in the residual network and <inline-formula id="j_infor421_ineq_075"><alternatives>
<mml:math><mml:mi mathvariant="italic">f</mml:mi><mml:mo fence="true" stretchy="false">{</mml:mo><mml:mi mathvariant="italic">x</mml:mi><mml:mo mathvariant="normal">,</mml:mo><mml:mi mathvariant="bold-italic">W</mml:mi><mml:mo fence="true" stretchy="false">}</mml:mo></mml:math>
<tex-math><![CDATA[$f\{x,\boldsymbol{W}\}$]]></tex-math></alternatives></inline-formula> is the final image reconstruction.</p>
</sec>
</sec>
<sec id="j_infor421_s_010">
<label>4</label>
<title>Experiments</title>
<sec id="j_infor421_s_011">
<label>4.1</label>
<title>Network Training</title>
<p>In this section, we discuss the details of our network training procedure. We use the <italic>tensorflow</italic> (Abadi <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_001">2015</xref>) deep learning framework for training and testing. The training dataset is formed from uncalibrated JPEG images of the publicly available <italic>Barcelona Calibrated Images Database</italic> (Párraga <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_032">2010</xref>) by extracting 1676 image patches of size <inline-formula id="j_infor421_ineq_076"><alternatives>
<mml:math><mml:mn>256</mml:mn><mml:mo>×</mml:mo><mml:mn>256</mml:mn></mml:math>
<tex-math><![CDATA[$256\times 256$]]></tex-math></alternatives></inline-formula>, taken from different parts of the original high-resolution (<inline-formula id="j_infor421_ineq_077"><alternatives>
<mml:math><mml:mn>2268</mml:mn><mml:mo>×</mml:mo><mml:mn>1512</mml:mn></mml:math>
<tex-math><![CDATA[$2268\times 1512$]]></tex-math></alternatives></inline-formula>) images. This corresponds to 107264 unique image blocks for training.</p>
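The block count above can be verified with a short sketch; the tiling function is illustrative, assuming non-overlapping 32×32 blocks (the block size used in the experiments) and exact divisibility.

```python
import numpy as np

def extract_blocks(patch, b=32):
    # Split a training patch into non-overlapping b x b blocks.
    h, w = patch.shape
    return [patch[i:i + b, j:j + b]
            for i in range(0, h, b)
            for j in range(0, w, b)]

# Each 256x256 patch yields (256/32)^2 = 64 blocks:
blocks_per_patch = len(extract_blocks(np.zeros((256, 256))))
total_blocks = 1676 * blocks_per_patch  # 107264 blocks, as stated in the text
```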
<p>Adam optimizer (Kingma and Ba, <xref ref-type="bibr" rid="j_infor421_ref_023">2015</xref>) (<inline-formula id="j_infor421_ineq_078"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">β</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0.9</mml:mn></mml:math>
<tex-math><![CDATA[${\beta _{1}}=0.9$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor421_ineq_079"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">β</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0.999</mml:mn></mml:math>
<tex-math><![CDATA[${\beta _{2}}=0.999$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor421_ineq_080"><alternatives>
<mml:math><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mrow><mml:mtext>e</mml:mtext></mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mn>8</mml:mn></mml:mrow></mml:msup></mml:math>
<tex-math><![CDATA[$\epsilon =1{\text{e}^{-8}}$]]></tex-math></alternatives></inline-formula>) is used for the network training. The learning rate for the loss function <inline-formula id="j_infor421_ineq_081"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathcal{L}_{1}}$]]></tex-math></alternatives></inline-formula> is set to 0.001 and the learning rate for the loss function <inline-formula id="j_infor421_ineq_082"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\mathcal{L}_{2}}$]]></tex-math></alternatives></inline-formula> is set to 0.0001. The number of epochs in the training stage is set to 256. The training was performed on an Intel i7-4770K@3.50 GHz computer with an NVIDIA GeForce GTX780 (GK110) graphics card.</p>
<p>We perform a series of experiments to corroborate the previous discussions and observations. To ensure a fair comparison, a set of 11 images (<italic>Monarch, Fingerprint, Flintstones, House, Parrot, Barbara, Boats, Cameraman, Foreman, Lena, Peppers</italic> – see TestDataset), which was also used in the evaluation of the competing methods, is used for testing with four different measurement sub-rates <inline-formula id="j_infor421_ineq_083"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac><mml:mrow><mml:mi mathvariant="italic">M</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">N</mml:mi></mml:mrow></mml:mfrac></mml:mstyle></mml:math>
<tex-math><![CDATA[$r=\frac{M}{N}$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_infor421_ineq_084"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo stretchy="false">∈</mml:mo><mml:mo fence="true" stretchy="false">{</mml:mo><mml:mn>0.25</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>0.1</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>0.04</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>0.01</mml:mn><mml:mo fence="true" stretchy="false">}</mml:mo></mml:math>
<tex-math><![CDATA[$r\in \{0.25,0.1,0.04,0.01\}$]]></tex-math></alternatives></inline-formula>. In our experiments, a block size of <inline-formula id="j_infor421_ineq_085"><alternatives>
<mml:math><mml:mn>32</mml:mn><mml:mo>×</mml:mo><mml:mn>32</mml:mn></mml:math>
<tex-math><![CDATA[$32\times 32$]]></tex-math></alternatives></inline-formula> is used.</p>
</sec>
<sec id="j_infor421_s_012">
<label>4.2</label>
<title>Measurement Matrix</title>
<p>In Section <xref rid="j_infor421_s_003">3</xref>, we discussed the connection between the measurement matrix learned by the linear encoder and the one obtained by the PCA analysis. In addition, we proposed an efficient initialization method for the network weights. In this section, we perform an experiment to show that the performance of the trained linear autoencoder is bounded by that of the PCA method in terms of image reconstruction quality.</p>
<table-wrap id="j_infor421_tab_001">
<label>Table 1</label>
<caption>
<p>Comparison of linear autoencoder and PCA in terms of reconstruction PSNR [dB].</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">PSNR [dB]</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_infor421_ineq_086"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.25</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.25$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_infor421_ineq_087"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.10</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.10$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_infor421_ineq_088"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.04</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.04$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_infor421_ineq_089"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.01</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.01$]]></tex-math></alternatives></inline-formula></td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">PCA</td>
<td style="vertical-align: top; text-align: left">31.45</td>
<td style="vertical-align: top; text-align: left">27.11</td>
<td style="vertical-align: top; text-align: left">23.95</td>
<td style="vertical-align: top; text-align: left">20.56</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">Linear autoencoder</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">31.39</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">27.06</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">23.92</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">20.55</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Table <xref rid="j_infor421_tab_001">1</xref> shows the mean reconstruction results in terms of PSNR for the standard test images. Notice that the reconstruction results are comparable. The slightly lower reconstruction performance of the linear autoencoder is due to the network not fully converging to the global minimum. The reconstruction results obtained by the PCA method represent an upper bound on the performance of the linear autoencoder network for CS image reconstruction.</p>
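For reference, the PSNR figure of merit reported in Table 1 can be computed as below; this is the standard definition, with the peak value of 255 being an assumption for 8-bit grayscale images.

```python
import numpy as np

def psnr(x, x_rec, peak=255.0):
    # Peak signal-to-noise ratio in dB between the original image x
    # and the reconstruction x_rec; peak=255 assumes 8-bit images.
    mse = np.mean((np.asarray(x, float) - np.asarray(x_rec, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```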
<p>In Fig. <xref rid="j_infor421_fig_007">7</xref>, reconstruction results obtained using the random Gaussian and the adaptive measurement matrix for the <italic>Parrot</italic> test image are shown. The reconstructions are presented for measurement rates <inline-formula id="j_infor421_ineq_090"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mo fence="true" stretchy="false">{</mml:mo><mml:mn>0.01</mml:mn><mml:mo mathvariant="normal">,</mml:mo><mml:mn>0.25</mml:mn><mml:mo fence="true" stretchy="false">}</mml:mo></mml:math>
<tex-math><![CDATA[$r=\{0.01,0.25\}$]]></tex-math></alternatives></inline-formula>. Notice that the adaptive measurement matrix preserves more information compared to the random Gaussian matrix.</p>
<fig id="j_infor421_fig_007">
<label>Fig. 7</label>
<caption>
<p>Reconstruction results obtained using linear autoencoder for “<italic>Parrot</italic>” test image (1) and for two measurement ratios <inline-formula id="j_infor421_ineq_091"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.01</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.01$]]></tex-math></alternatives></inline-formula> (2, 3) and <inline-formula id="j_infor421_ineq_092"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.25</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.25$]]></tex-math></alternatives></inline-formula> (4, 5). Reconstructions labelled with (2) and (4) are obtained using the random Gaussian measurement matrix, while (3) and (5) are obtained using the adaptive measurement matrix.</p>
</caption>
<graphic xlink:href="infor421_g007.jpg"/>
</fig>
</sec>
<sec id="j_infor421_s_013">
<label>4.3</label>
<title>Comparison to Other Methods</title>
<p>In this section, we compare the proposed CS model to other state-of-the-art learning-based CS methods. To ensure a fair comparison, we consider only similar methods that use an adaptive linear encoding part.</p>
<p>We compare our method to the ImpReconNet (Lohit <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_026">2018</xref>), Adp-Rec (Xie <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_041">2017</xref>), FCMN (Du <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_013">2019</xref>) and two variants of PCS (Du <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>), namely <inline-formula id="j_infor421_ineq_093"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">PCS</mml:mtext></mml:mrow><mml:mrow><mml:mtext mathvariant="italic">conv22</mml:mtext></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{PCS}_{\textit{conv22}}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor421_ineq_094"><alternatives>
<mml:math><mml:mi mathvariant="italic">P</mml:mi><mml:mi mathvariant="italic">C</mml:mi><mml:msub><mml:mrow><mml:mi mathvariant="italic">S</mml:mi></mml:mrow><mml:mrow><mml:mtext mathvariant="italic">conv34</mml:mtext></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[$PC{S_{\textit{conv34}}}$]]></tex-math></alternatives></inline-formula>. In Table <xref rid="j_infor421_tab_002">2</xref>, mean PSNR reconstruction results (on the same test dataset) for the proposed method and the competing methods are shown. ImpReconNet (Euc) denotes a variant of the ReconNet model that uses a Euclidean loss function for the network training, while ImpReconNet (Euc+Adv) denotes a variant that uses a combination of Euclidean and adversarial losses. The competing PSNR values are shown as reported in the original papers or reproduced using the available algorithms and models. In Fig. <xref rid="j_infor421_fig_008">8</xref>, “<italic>Fingerprint</italic>” test image reconstructions are shown alongside the ground truth.</p>
<table-wrap id="j_infor421_tab_002">
<label>Table 2</label>
<caption>
<p>Reconstruction results obtained using the learned measurement matrix. The table contains mean PSNR reconstruction results for the standard test images at different measurement rates <italic>r</italic>. Although FCMN achieves better results in terms of PSNR, it is clearly visible from Fig. <xref rid="j_infor421_fig_008">8</xref> that it does not preserve structural information. This is due to the fact that PSNR measures image quality on a per-pixel basis, which is not a relevant measure for the preservation of high-level image features.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Mean PSNR [dB] for different methods</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_infor421_ineq_095"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.25</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.25$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_infor421_ineq_096"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.10</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.10$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_infor421_ineq_097"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.04</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.04$]]></tex-math></alternatives></inline-formula></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><inline-formula id="j_infor421_ineq_098"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.01</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.01$]]></tex-math></alternatives></inline-formula></td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">ImpReconNet (Euc) (Lohit <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_026">2018</xref>)</td>
<td style="vertical-align: top; text-align: left">26.59</td>
<td style="vertical-align: top; text-align: left">25.51</td>
<td style="vertical-align: top; text-align: left">23.14</td>
<td style="vertical-align: top; text-align: left">19.44</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">ImpReconNet (Euc + Adv) (Lohit <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_026">2018</xref>)</td>
<td style="vertical-align: top; text-align: left">30.53</td>
<td style="vertical-align: top; text-align: left">26.47</td>
<td style="vertical-align: top; text-align: left">22.98</td>
<td style="vertical-align: top; text-align: left">19.06</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">Adp-Rec (Xie <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_041">2017</xref>)</td>
<td style="vertical-align: top; text-align: left">30.80</td>
<td style="vertical-align: top; text-align: left">27.53</td>
<td style="vertical-align: top; text-align: left">–</td>
<td style="vertical-align: top; text-align: left">20.33</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">FCMN (Du <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_013">2019</xref>)</td>
<td style="vertical-align: top; text-align: left">32.67</td>
<td style="vertical-align: top; text-align: left">28.30</td>
<td style="vertical-align: top; text-align: left">23.87</td>
<td style="vertical-align: top; text-align: left">21.27</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left"><inline-formula id="j_infor421_ineq_099"><alternatives>
<mml:math><mml:mi mathvariant="italic">P</mml:mi><mml:mi mathvariant="italic">C</mml:mi><mml:msub><mml:mrow><mml:mi mathvariant="italic">S</mml:mi></mml:mrow><mml:mrow><mml:mtext mathvariant="italic">conv22</mml:mtext></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[$PC{S_{\textit{conv22}}}$]]></tex-math></alternatives></inline-formula> (Du <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>)</td>
<td style="vertical-align: top; text-align: left">–</td>
<td style="vertical-align: top; text-align: left">–</td>
<td style="vertical-align: top; text-align: left">19.38</td>
<td style="vertical-align: top; text-align: left">18.30</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left"><inline-formula id="j_infor421_ineq_100"><alternatives>
<mml:math><mml:mi mathvariant="italic">P</mml:mi><mml:mi mathvariant="italic">C</mml:mi><mml:msub><mml:mrow><mml:mi mathvariant="italic">S</mml:mi></mml:mrow><mml:mrow><mml:mtext mathvariant="italic">conv34</mml:mtext></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[$PC{S_{\textit{conv34}}}$]]></tex-math></alternatives></inline-formula> (Du <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor421_ref_012">2018</xref>)</td>
<td style="vertical-align: top; text-align: left">–</td>
<td style="vertical-align: top; text-align: left">–</td>
<td style="vertical-align: top; text-align: left">16.72</td>
<td style="vertical-align: top; text-align: left">16.80</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><bold>Proposed method</bold></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">32.00</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">26.36</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">23.67</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">20.51</td>
</tr>
</tbody>
</table>
</table-wrap>
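Since Table 2 and the figure captions report all results as PSNR, the following short sketch shows how these values are computed from its standard definition (the peak value of 255 for 8-bit images is assumed; this is not taken from the authors' code). Its purely per-pixel nature is exactly why it can rank a structurally worse reconstruction higher:

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB; a purely per-pixel measure."""
    mse = np.mean((reference.astype(np.float64)
                   - reconstruction.astype(np.float64))**2)
    if mse == 0:
        return np.inf            # identical images
    return 10.0 * np.log10(peak**2 / mse)

img = np.full((64, 64), 100.0)
print(psnr(img, img + 16.0))     # MSE = 256  ->  about 24.05 dB
```

A uniform offset of 16 grey levels costs about 24 dB even though it leaves every edge and texture intact, which illustrates why the paper pairs PSNR with a visual comparison of structural preservation.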
<fig id="j_infor421_fig_008">
<label>Fig. 8</label>
<caption>
<p>Reconstruction results for “<italic>Fingerprint</italic>” test image and for measurement rate <inline-formula id="j_infor421_ineq_101"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.04</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.04$]]></tex-math></alternatives></inline-formula>: (1) original, (2) ImpReconNet (Euc + Adv), <inline-formula id="j_infor421_ineq_102"><alternatives>
<mml:math><mml:mtext mathvariant="italic">PSNR</mml:mtext><mml:mo>=</mml:mo><mml:mn>16.97</mml:mn></mml:math>
<tex-math><![CDATA[$\textit{PSNR}=16.97$]]></tex-math></alternatives></inline-formula> dB, (3) FCMN, <inline-formula id="j_infor421_ineq_103"><alternatives>
<mml:math><mml:mtext mathvariant="italic">PSNR</mml:mtext><mml:mo>=</mml:mo><mml:mn>19.05</mml:mn></mml:math>
<tex-math><![CDATA[$\textit{PSNR}=19.05$]]></tex-math></alternatives></inline-formula> dB, (4) <inline-formula id="j_infor421_ineq_104"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">PCS</mml:mtext></mml:mrow><mml:mrow><mml:mtext mathvariant="italic">conv</mml:mtext><mml:mn>22</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{PCS}_{\textit{conv}22}}$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor421_ineq_105"><alternatives>
<mml:math><mml:mtext mathvariant="italic">PSNR</mml:mtext><mml:mo>=</mml:mo><mml:mn>14.83</mml:mn></mml:math>
<tex-math><![CDATA[$\textit{PSNR}=14.83$]]></tex-math></alternatives></inline-formula> dB, (5) <inline-formula id="j_infor421_ineq_106"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">PCS</mml:mtext></mml:mrow><mml:mrow><mml:mtext mathvariant="italic">conv</mml:mtext><mml:mn>34</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{PCS}_{\textit{conv}34}}$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor421_ineq_107"><alternatives>
<mml:math><mml:mtext mathvariant="italic">PSNR</mml:mtext><mml:mo>=</mml:mo><mml:mn>14.35</mml:mn></mml:math>
<tex-math><![CDATA[$\textit{PSNR}=14.35$]]></tex-math></alternatives></inline-formula> dB, (6) proposed method, <inline-formula id="j_infor421_ineq_108"><alternatives>
<mml:math><mml:mtext mathvariant="italic">PSNR</mml:mtext><mml:mo>=</mml:mo><mml:mn>20.31</mml:mn></mml:math>
<tex-math><![CDATA[$\textit{PSNR}=20.31$]]></tex-math></alternatives></inline-formula> dB. Our method preserves structure better than the ImpReconNet and FCMN methods, while it outperforms the <italic>PCS</italic> methods by a margin of around 5 dB in PSNR.</p>
</caption>
<graphic xlink:href="infor421_g008.jpg"/>
</fig>
<p>On the one hand, FCMN and ImpReconNet yield results similar to our method in terms of PSNR (see Table <xref rid="j_infor421_tab_002">2</xref>); on the other hand, these methods do not preserve structural and high-level semantic information. The two PCS methods preserve structural information, but yield images that contain a significant amount of noise at the pixel level. Our method benefits from the combination of the pixel-wise Euclidean loss in image space and the Euclidean loss in the latent space of the <inline-formula id="j_infor421_ineq_109"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">VGG</mml:mtext></mml:mrow><mml:mrow><mml:mn>19</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{VGG}_{19}}$]]></tex-math></alternatives></inline-formula> network, resulting in high pixel-wise accuracy as well as good preservation of structural information. A similar observation holds for the “<italic>Monarch</italic>” reconstructions in Fig. <xref rid="j_infor421_fig_009">9</xref>, where a comparison between the competing perceptual CS methods and the proposed method at the extremely low measurement rate <inline-formula id="j_infor421_ineq_110"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.01</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.01$]]></tex-math></alternatives></inline-formula> is presented. Notice the high level of noise in the PCS reconstructions compared to the reconstruction obtained using the proposed method.</p>
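The loss combination described above can be sketched as follows. This is a minimal, hypothetical numpy illustration: the feature map <monospace>phi</monospace> here is a fixed random linear map with a ReLU, standing in for the pretrained VGG19 feature extractor used in the paper, and the weight <monospace>lam</monospace> is an assumed balancing parameter, not a value reported by the authors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the VGG19 feature extractor: a fixed random
# linear map plus ReLU. The actual training pipeline would instead pass
# the images through the pretrained VGG19 network up to a chosen layer.
phi_W = rng.standard_normal((128, 1024)) * 0.03

def phi(x_flat):
    return np.maximum(phi_W @ x_flat, 0.0)   # toy "feature" map

def combined_loss(x_hat, x, lam=0.1):
    """Pixel-wise Euclidean loss plus Euclidean loss in feature space."""
    pixel_term = np.mean((x_hat - x)**2)
    latent_term = np.mean((phi(x_hat.ravel()) - phi(x.ravel()))**2)
    return pixel_term + lam * latent_term

x = rng.standard_normal((32, 32))            # "ground-truth" image patch
x_hat = x + 0.05*rng.standard_normal((32, 32))  # noisy "reconstruction"
print(combined_loss(x_hat, x) >= combined_loss(x, x))  # zero at x_hat == x
```

The pixel term drives per-pixel accuracy, while the feature-space term penalizes reconstructions whose high-level representations drift from those of the original, which is the mechanism behind the structural preservation discussed above.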
<fig id="j_infor421_fig_009">
<label>Fig. 9</label>
<caption>
<p>Reconstruction results for “<italic>Monarch</italic>” test image and for measurement rate <inline-formula id="j_infor421_ineq_111"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.01</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.01$]]></tex-math></alternatives></inline-formula>: (1) original, (2) <inline-formula id="j_infor421_ineq_112"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">PCS</mml:mtext></mml:mrow><mml:mrow><mml:mtext mathvariant="italic">conv</mml:mtext><mml:mn>22</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{PCS}_{\textit{conv}22}}$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor421_ineq_113"><alternatives>
<mml:math><mml:mtext mathvariant="italic">PSNR</mml:mtext><mml:mo>=</mml:mo><mml:mn>16.28</mml:mn></mml:math>
<tex-math><![CDATA[$\textit{PSNR}=16.28$]]></tex-math></alternatives></inline-formula> dB, (3) <inline-formula id="j_infor421_ineq_114"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">PCS</mml:mtext></mml:mrow><mml:mrow><mml:mtext mathvariant="italic">conv</mml:mtext><mml:mn>34</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{PCS}_{\textit{conv}34}}$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor421_ineq_115"><alternatives>
<mml:math><mml:mtext mathvariant="italic">PSNR</mml:mtext><mml:mo>=</mml:mo><mml:mn>14.87</mml:mn></mml:math>
<tex-math><![CDATA[$\textit{PSNR}=14.87$]]></tex-math></alternatives></inline-formula> dB, (4) proposed method, <inline-formula id="j_infor421_ineq_116"><alternatives>
<mml:math><mml:mtext mathvariant="italic">PSNR</mml:mtext><mml:mo>=</mml:mo><mml:mn>18.04</mml:mn></mml:math>
<tex-math><![CDATA[$\textit{PSNR}=18.04$]]></tex-math></alternatives></inline-formula> dB. Although the PCS methods successfully reconstruct higher-level semantic information, they suffer from a significant amount of noise. In contrast, our method reconstructs the same information with less noise and fewer visual artifacts.</p>
</caption>
<graphic xlink:href="infor421_g009.jpg"/>
</fig>
</sec>
</sec>
<sec id="j_infor421_s_014">
<label>5</label>
<title>Discussion</title>
<p>The iterative nature and high computational complexity are the main drawbacks of traditional CS reconstruction algorithms. Learning-based methods for CS image reconstruction offer an efficient alternative to the traditional approach. The average per-image reconstruction time for a set of images of size <inline-formula id="j_infor421_ineq_117"><alternatives>
<mml:math><mml:mn>512</mml:mn><mml:mo>×</mml:mo><mml:mn>512</mml:mn></mml:math>
<tex-math><![CDATA[$512\times 512$]]></tex-math></alternatives></inline-formula> using traditional <inline-formula id="j_infor421_ineq_118"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mi mathvariant="italic">l</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${l_{1}}$]]></tex-math></alternatives></inline-formula> reconstruction method from the SPArse Modeling Software (SPAMS, <xref ref-type="bibr" rid="j_infor421_ref_037">2010</xref>) optimization toolbox and a block-based approach with a measurement rate of <inline-formula id="j_infor421_ineq_119"><alternatives>
<mml:math><mml:mi mathvariant="italic">r</mml:mi><mml:mo>=</mml:mo><mml:mn>0.04</mml:mn></mml:math>
<tex-math><![CDATA[$r=0.04$]]></tex-math></alternatives></inline-formula> is around 0.6 s, while the learning-based method reduces the reconstruction time to around 0.025 s. An example of a real-world application of the learning-based approach is (Ralašić and Seršić, <xref ref-type="bibr" rid="j_infor421_ref_034">2019</xref>), where the authors propose a real-time motion detection system for CS video that operates at extremely low measurement rates.</p>
<p>The better reconstruction-phase performance of learning-based methods comes at an increased cost in the training phase. To learn the optimal measurement and reconstruction operators, learning-based methods require an offline training procedure with a relatively large training dataset. Since learning-based methods are data driven, they are also data dependent: if the statistical distribution of the training dataset significantly differs from that of the testing data, their performance will suffer. Finally, convolutional block image processing is not applicable in imaging modalities where the measurements correspond to the whole signal and the signal cannot be divided into smaller blocks.</p>
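The block-based measurement model underlying the timing figures above can be sketched as follows. The block size <monospace>B = 32</monospace> is an illustrative assumption (the paper's exact block size is not restated here); the sketch only shows how a 512 × 512 image is split into non-overlapping blocks that are measured independently:

```python
import numpy as np

rng = np.random.default_rng(0)
B, r = 32, 0.04                       # block size and measurement rate (illustrative)
n = B * B                             # pixels per block
m = max(1, round(r * n))              # measurements per block (41 here)

image = rng.random((512, 512))
Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # one shared measurement matrix

# slice the image into non-overlapping BxB blocks, one vectorized block per row
blocks = (image.reshape(512 // B, B, 512 // B, B)
               .transpose(0, 2, 1, 3)
               .reshape(-1, n))       # (256 blocks, 1024 pixels each)
measurements = blocks @ Phi.T         # (256 blocks, 41 measurements each)
print(measurements.shape)
```

Because every block shares the same small measurement matrix, the per-block decoding is a fixed-size operation that parallelizes trivially; this is what a learning-based decoder exploits, and also why the approach fails when the physical measurements span the whole signal and no such blocking exists.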
</sec>
<sec id="j_infor421_s_015">
<label>6</label>
<title>Conclusion</title>
<p>In this paper, we proposed a convolutional autoencoder architecture for compressive sensing image reconstruction, which represents a non-iterative and extremely fast alternative to the traditional sparse optimization algorithms. In contrast to other learning-based methods, we designed a measurement process that enables the model to be trained on normalized, mean-centred measurements, which results in a significant speedup of the neural network convergence. Moreover, we proposed an efficient initialization method for the autoencoder network weights based on the connection between the learning-based CS approach and principal component analysis. A residual learning network was used to further improve the initial reconstruction obtained by the autoencoder.</p>
<p>We proposed a combination of a pixel-wise Euclidean loss function for the autoencoder network training and a Euclidean loss function in the latent space of the <inline-formula id="j_infor421_ineq_120"><alternatives>
<mml:math><mml:msub><mml:mrow><mml:mtext mathvariant="italic">VGG</mml:mtext></mml:mrow><mml:mrow><mml:mn>19</mml:mn></mml:mrow></mml:msub></mml:math>
<tex-math><![CDATA[${\textit{VGG}_{19}}$]]></tex-math></alternatives></inline-formula> network for the residual network training. It results in image reconstructions with higher pixel-wise accuracy and more semantic information preserved at low measurement rates. In our future work, we will explore different loss functions that correspond to the notion of perceptual loss.</p>
</sec>
</body>
<back>
<ref-list id="j_infor421_reflist_001">
<title>References</title>
<ref id="j_infor421_ref_001">
<mixed-citation publication-type="other"><string-name><surname>Abadi</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Agarwal</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Barham</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Brevdo</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>Z.</given-names></string-name>, <string-name><surname>Citro</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Corrado</surname>, <given-names>G.S.</given-names></string-name>, <string-name><surname>Davis</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Dean</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Devin</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Ghemawat</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Goodfellow</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Harp</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Irving</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Isard</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Jia</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Jozefowicz</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Kaiser</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Kudlur</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Levenberg</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Mané</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Monga</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Moore</surname>, <given-names>S.</given-names></string-name>, 
<string-name><surname>Murray</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Olah</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Schuster</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Shlens</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Steiner</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Sutskever</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Talwar</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Tucker</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Vanhoucke</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Vasudevan</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Viégas</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Vinyals</surname>, <given-names>O.</given-names></string-name>, <string-name><surname>Warden</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Wattenberg</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Wicke</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Yu</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Zheng</surname>, <given-names>X.</given-names></string-name> (2015). <italic>TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems</italic>. Online: <uri>https://www.tensorflow.org/</uri>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_002">
<mixed-citation publication-type="journal"><string-name><surname>Baldi</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Hornik</surname>, <given-names>K.</given-names></string-name> (<year>1989</year>). <article-title>Neural networks and principal component analysis: learning from examples without local minima</article-title>. <source>Neural Networks</source>, <volume>2</volume>, <fpage>53</fpage>–<lpage>58</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_003">
<mixed-citation publication-type="journal"><string-name><surname>Baraniuk</surname>, <given-names>R.G.</given-names></string-name> (<year>2007</year>). <article-title>Compressive sensing [lecture notes]</article-title>. <source>IEEE Signal Processing Magazine</source>, <volume>24</volume>, <fpage>118</fpage>–<lpage>121</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_004">
<mixed-citation publication-type="journal"><string-name><surname>Beck</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Teboulle</surname>, <given-names>M.</given-names></string-name> (<year>2009</year>). <article-title>A fast iterative shrinkage-thresholding algorithm for linear inverse problems</article-title>. <source>SIAM Journal on Imaging Sciences</source>, <volume>2</volume>, <fpage>183</fpage>–<lpage>202</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_005">
<mixed-citation publication-type="journal"><string-name><surname>Becker</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Bobin</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Candès</surname>, <given-names>E.J.</given-names></string-name> (<year>2011</year>). <article-title>NESTA: a fast and accurate first-order method for sparse recovery</article-title>. <source>SIAM Journal on Imaging Sciences</source>, <volume>4</volume>, <fpage>1</fpage>–<lpage>39</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_006">
<mixed-citation publication-type="chapter"><string-name><surname>Bertalmio</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Sapiro</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Caselles</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Ballester</surname>, <given-names>C.</given-names></string-name> (<year>2000</year>). <chapter-title>Image inpainting</chapter-title>. In: <source>Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques</source>. <publisher-name>ACM Press/Addison-Wesley Publishing Co.</publisher-name>, pp. <fpage>417</fpage>–<lpage>424</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_007">
<mixed-citation publication-type="journal"><string-name><surname>Bugeau</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Bertalmio</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Caselles</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Sapiro</surname>, <given-names>G.</given-names></string-name> (<year>2010</year>). <article-title>A comprehensive framework for image inpainting</article-title>. <source>IEEE Transactions on Image Processing</source>, <volume>19</volume>, <fpage>2634</fpage>–<lpage>2645</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_008">
<mixed-citation publication-type="journal"><string-name><surname>Candes</surname>, <given-names>E.J.</given-names></string-name>, <string-name><surname>Tao</surname>, <given-names>T.</given-names></string-name> (<year>2006</year>). <article-title>Near-optimal signal recovery from random projections: universal encoding strategies?</article-title> <source>IEEE Transactions on Information Theory</source>, <volume>52</volume>, <fpage>5406</fpage>–<lpage>5425</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_009">
<mixed-citation publication-type="chapter"><string-name><surname>Dabov</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Foi</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Katkovnik</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Egiazarian</surname>, <given-names>K.</given-names></string-name> (<year>2009</year>). <chapter-title>BM3D image denoising with shape-adaptive principal component analysis</chapter-title>. In: <source>SPARS’09-Signal Processing with Adaptive Sparse Structured Representations</source>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_010">
<mixed-citation publication-type="journal"><string-name><surname>Dong</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Loy</surname>, <given-names>C.C.</given-names></string-name>, <string-name><surname>He</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Tang</surname>, <given-names>X.</given-names></string-name> (<year>2016</year>). <article-title>Image super-resolution using deep convolutional networks</article-title>. <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>, <volume>38</volume>, <fpage>295</fpage>–<lpage>307</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_011">
<mixed-citation publication-type="journal"><string-name><surname>Du</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Xie</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Shi</surname>, <given-names>G.</given-names></string-name> (<year>2012</year>). <article-title>Block-based compressed sensing of images and video</article-title>. <source>Foundations and Trends in Signal Processing</source>, <volume>4</volume>, <fpage>297</fpage>–<lpage>416</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_012">
<mixed-citation publication-type="journal"><string-name><surname>Du</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Xie</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Shi</surname>, <given-names>G.</given-names></string-name> (<year>2018</year>). <article-title>Perceptual compressive sensing</article-title>. <source>Elsevier Neurocomputing</source>, <volume>328</volume>, <fpage>105</fpage>–<lpage>112</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_013">
<mixed-citation publication-type="journal"><string-name><surname>Du</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Xie</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Shi</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Xu</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>Y.</given-names></string-name> (<year>2019</year>). <article-title>Fully convolutional measurement network for compressive sensing image reconstruction</article-title>. <source>Elsevier Neurocomputing</source>, <volume>328</volume>, <fpage>105</fpage>–<lpage>112</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_014">
<mixed-citation publication-type="journal"><string-name><surname>Duarte</surname>, <given-names>M.F.</given-names></string-name>, <string-name><surname>Eldar</surname>, <given-names>Y.C.</given-names></string-name> (<year>2011</year>). <article-title>Structured compressed sensing: from theory to applications</article-title>. <source>IEEE Transactions on Signal Processing</source>, <volume>59</volume>, <fpage>4053</fpage>–<lpage>4085</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_015">
<mixed-citation publication-type="journal"><string-name><surname>Duarte</surname>, <given-names>M.F.</given-names></string-name>, <string-name><surname>Baraniuk</surname>, <given-names>R.G.</given-names></string-name> (<year>2012</year>). <article-title>Kronecker compressive sensing</article-title>. <source>IEEE Transactions on Image Processing</source>, <volume>21</volume>, <fpage>494</fpage>–<lpage>504</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_016">
<mixed-citation publication-type="other"><string-name><surname>Dumoulin</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Visin</surname>, <given-names>F.</given-names></string-name> (2016). <italic>A guide to convolution arithmetic for deep learning</italic>. ArXiv preprint <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/arXiv:1603.07285">arXiv:1603.07285</ext-link>, pp. 1–13.</mixed-citation>
</ref>
<ref id="j_infor421_ref_017">
<mixed-citation publication-type="journal"><string-name><surname>Elad</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Aharon</surname>, <given-names>M.</given-names></string-name> (<year>2006</year>). <article-title>Image denoising via sparse and redundant representations over learned dictionaries</article-title>. <source>IEEE Transactions on Image Processing</source>, <volume>15</volume>, <fpage>3736</fpage>–<lpage>3745</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_018">
<mixed-citation publication-type="other"><string-name><surname>Yao</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Dai</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Zhang</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Zhang</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Tian</surname>, <given-names>Q.</given-names></string-name>, <string-name><surname>Xu</surname>, <given-names>C.</given-names></string-name> (2019). DR2-Net: deep residual reconstruction network for image compressive sensing. <italic>Neurocomputing</italic>. arXiv preprint <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1702.05743">arXiv:1702.05743</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_019">
<mixed-citation publication-type="other"><string-name><surname>He</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Zhang</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Ren</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Sun</surname>, <given-names>J.</given-names></string-name> (2015). <italic>Deep residual learning for image recognition</italic>. arXiv preprint <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1512.03385">arXiv:1512.03385</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_020">
<mixed-citation publication-type="chapter"><string-name><surname>Ioffe</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Szegedy</surname>, <given-names>C.</given-names></string-name> (<year>2015</year>). <chapter-title>Batch normalization: accelerating deep network training by reducing internal covariate shift</chapter-title>. In: <source>International Conference on Machine Learning, 2015</source>, pp. <fpage>448</fpage>–<lpage>456</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_021">
<mixed-citation publication-type="chapter"><string-name><surname>Johnson</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Alahi</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Fei-Fei</surname>, <given-names>L.</given-names></string-name> (<year>2016</year>). <chapter-title>Perceptual losses for real-time style transfer and super-resolution</chapter-title>. In: <source>European Conference on Computer Vision (ECCV), 2016</source>. <publisher-name>Springer</publisher-name>, pp. <fpage>694</fpage>–<lpage>711</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_022">
<mixed-citation publication-type="other"><string-name><surname>Karpathy</surname>, <given-names>A.</given-names></string-name> (2017). CS231n: Convolutional Neural Networks for Visual Recognition, Spring 2017. Online: <uri>http://cs231n.github.io/neural-networks-1/</uri>, accessed: June 2019.</mixed-citation>
</ref>
<ref id="j_infor421_ref_023">
<mixed-citation publication-type="chapter"><string-name><surname>Kingma</surname>, <given-names>D.P.</given-names></string-name>, <string-name><surname>Ba</surname>, <given-names>J.</given-names></string-name> (<year>2015</year>). <chapter-title>Adam: a method for stochastic optimization</chapter-title>. In: <source>Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015)</source>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_024">
<mixed-citation publication-type="other"><string-name><surname>Kristiadi</surname>, <given-names>A.</given-names></string-name> (2017). Why does L2 reconstruction loss yield blurry images? Online: <uri>https://wiseodd.github.io/techblog/2017/02/09/why-l2-blurry/</uri>, accessed: June 2019.</mixed-citation>
</ref>
<ref id="j_infor421_ref_025">
<mixed-citation publication-type="chapter"><string-name><surname>Kulkarni</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Lohit</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Turaga</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Kerviche</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Ashok</surname>, <given-names>A.</given-names></string-name> (<year>2016</year>). <chapter-title>ReconNet: non-iterative reconstruction of images from compressively sensed measurements</chapter-title>. In: <source>IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_026">
<mixed-citation publication-type="journal"><string-name><surname>Lohit</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Kulkarni</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Kerviche</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Turaga</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Ashok</surname>, <given-names>A.</given-names></string-name> (<year>2018</year>). <article-title>Convolutional neural networks for non-iterative reconstruction of compressively sensed images</article-title>. <source>IEEE Transactions on Computational Imaging</source>, <volume>4</volume>, <fpage>326</fpage>–<lpage>340</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_027">
<mixed-citation publication-type="journal"><string-name><surname>Mallat</surname>, <given-names>S.G.</given-names></string-name>, <string-name><surname>Zhang</surname>, <given-names>Z.</given-names></string-name> (<year>1993</year>). <article-title>Matching pursuits with time-frequency dictionaries</article-title>. <source>IEEE Transactions on Signal Processing</source>, <volume>41</volume>, <fpage>3397</fpage>–<lpage>3415</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_028">
<mixed-citation publication-type="chapter"><string-name><surname>Mousavi</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Baraniuk</surname>, <given-names>R.G.</given-names></string-name> (<year>2017</year>). <chapter-title>Learning to invert: signal recovery via deep convolutional networks</chapter-title>. In: <source>IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017</source>, pp. <fpage>2272</fpage>–<lpage>2276</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_029">
<mixed-citation publication-type="chapter"><string-name><surname>Mousavi</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Patel</surname>, <given-names>A.B.</given-names></string-name>, <string-name><surname>Baraniuk</surname>, <given-names>R.G.</given-names></string-name> (<year>2015</year>). <chapter-title>A deep learning approach to structured signal recovery</chapter-title>. In: <source>2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton)</source>, pp. <fpage>1336</fpage>–<lpage>1343</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_030">
<mixed-citation publication-type="chapter"><string-name><surname>Mousavi</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Dasarathy</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Baraniuk</surname>, <given-names>R.G.</given-names></string-name> (<year>2017</year>). <chapter-title>DeepCodec: adaptive sensing and recovery via deep convolutional neural networks</chapter-title>. In: <source>2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton)</source>, pp. <fpage>744</fpage>–<lpage>744</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_031">
<mixed-citation publication-type="journal"><string-name><surname>Needell</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Tropp</surname>, <given-names>J.A.</given-names></string-name> (<year>2009</year>). <article-title>CoSaMP: iterative signal recovery from incomplete and inaccurate samples</article-title>. <source>Applied and Computational Harmonic Analysis</source>, <volume>26</volume>, <fpage>301</fpage>–<lpage>321</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_032">
<mixed-citation publication-type="chapter"><string-name><surname>Párraga</surname>, <given-names>C.A.</given-names></string-name>, <string-name><surname>Baldrich</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Vanrell</surname>, <given-names>M.</given-names></string-name> (<year>2010</year>). <chapter-title>Accurate mapping of natural scenes radiance to cone activation space: a new image dataset</chapter-title>. In: <source>Conference on Colour in Graphics, Imaging, and Vision, 2010</source>. <publisher-name>Society for Imaging Science and Technology</publisher-name>, pp. <fpage>50</fpage>–<lpage>57</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_033">
<mixed-citation publication-type="chapter"><string-name><surname>Pati</surname>, <given-names>Y.C.</given-names></string-name>, <string-name><surname>Rezaiifar</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Krishnaprasad</surname>, <given-names>P.S.</given-names></string-name> (<year>1993</year>). <chapter-title>Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition</chapter-title>. In: <source>Conference Record of The Twenty-Seventh Asilomar Conference on Signals, Systems and Computers, 1993</source>. <publisher-name>IEEE</publisher-name>, pp. <fpage>40</fpage>–<lpage>44</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_034">
<mixed-citation publication-type="chapter"><string-name><surname>Ralašić</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Seršić</surname>, <given-names>D.</given-names></string-name> (<year>2019</year>). <chapter-title>Real-time motion detection in extremely subsampled compressive sensing video</chapter-title>. In: <source>2019 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)</source>, pp. <fpage>198</fpage>–<lpage>203</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_035">
<mixed-citation publication-type="journal"><string-name><surname>Ralašić</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Seršić</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Petrinović</surname>, <given-names>D.</given-names></string-name> (<year>2018</year>). <article-title>Off-the-shelf measurement setup for compressive imaging</article-title>. <source>IEEE Transactions on Instrumentation and Measurement</source>, <volume>68</volume>, <fpage>502</fpage>–<lpage>512</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_036">
<mixed-citation publication-type="other"><string-name><surname>Simonyan</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Zisserman</surname>, <given-names>A.</given-names></string-name> (2014). <italic>Very deep convolutional networks for large-scale image recognition</italic>. arXiv preprint <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1409.1556">arXiv:1409.1556</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_037">
<mixed-citation publication-type="other">Sparse Modeling Software – optimization toolbox (2010). Online: <uri>http://spams-devel.gforge.inria.fr/index.html</uri>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_038">
<mixed-citation publication-type="chapter"><string-name><surname>Takhar</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Laska</surname>, <given-names>J.N.</given-names></string-name>, <string-name><surname>Wakin</surname>, <given-names>M.B.</given-names></string-name>, <string-name><surname>Duarte</surname>, <given-names>M.F.</given-names></string-name>, <string-name><surname>Baron</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Sarvotham</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Kelly</surname>, <given-names>K.F.</given-names></string-name>, <string-name><surname>Baraniuk</surname>, <given-names>R.G.</given-names></string-name> (<year>2006</year>). <chapter-title>A new compressive imaging camera architecture using optical-domain compression</chapter-title>. In: <source>Computational Imaging IV, 2006</source>. <publisher-name>International Society for Optics and Photonics</publisher-name>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_039">
<mixed-citation publication-type="other">Testing dataset for learning-based compressive sensing reconstruction (2019). Online: <uri>https://github.com/KuldeepKulkarni/ReconNet/tree/master/test/test_images</uri>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_040">
<mixed-citation publication-type="journal"><string-name><surname>Wright</surname>, <given-names>S.J.</given-names></string-name>, <string-name><surname>Nowak</surname>, <given-names>R.D.</given-names></string-name>, <string-name><surname>Figueiredo</surname>, <given-names>M.A.T.</given-names></string-name> (<year>2009</year>). <article-title>Sparse reconstruction by separable approximation</article-title>. <source>IEEE Transactions on Signal Processing</source>, <volume>57</volume>, <fpage>2479</fpage>–<lpage>2493</lpage>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_041">
<mixed-citation publication-type="chapter"><string-name><surname>Xie</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Shi</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Du</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Han</surname>, <given-names>X.</given-names></string-name> (<year>2017</year>). <chapter-title>Adaptive measurement network for CS image reconstruction</chapter-title>. In: <source>CCF Chinese Conference on Computer Vision 2017</source>. <publisher-name>Springer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_infor421_ref_042">
<mixed-citation publication-type="journal"><string-name><surname>Yang</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Wright</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Huang</surname>, <given-names>T.S.</given-names></string-name>, <string-name><surname>Ma</surname>, <given-names>Y.</given-names></string-name> (<year>2010</year>). <article-title>Image super-resolution via sparse representation</article-title>. <source>IEEE Transactions on Image Processing</source>, <volume>19</volume>, <fpage>2861</fpage>–<lpage>2873</lpage>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>