Graduate Student Presentations

The Implicit Bias of Minima Stability: A View from Function Space (NeurIPS 2021)

Rotem Mulayoff, Tomer Michaeli, Daniel Soudry

The loss landscapes of over-parameterized neural networks have multiple global minima. However, it is well known that stochastic gradient descent (SGD) can stably converge only to minima that are sufficiently flat w.r.t. SGD's step size. In this paper we study the effect that this mechanism has on the function implemented by the trained model. First, we extend the existing knowledge on minima stability to non-differentiable minima, which are common in ReLU nets. We then use our stability results to study a single-hidden-layer univariate ReLU network. In this setting, we show that SGD is biased towards functions whose second derivative (w.r.t. the input) has a bounded weighted L1 norm, regardless of the initialization. In particular, we show that the function implemented by the network upon convergence gets smoother as the learning rate increases. The weight multiplying the second derivative is larger around the center of the support of the training distribution, and smaller towards its boundaries, suggesting that a trained model tends to be smoother at the center of the training distribution.
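
For readers less familiar with the stability mechanism invoked above, here is the classical differentiable case in two lines (standard background, not the paper's new non-differentiable analysis):

```latex
% Near a twice-differentiable minimum \theta^*, gradient descent linearizes to
% \theta_{t+1} - \theta^* \approx (I - \eta H)(\theta_t - \theta^*), with H = \nabla^2 L(\theta^*).
% The iterates remain bounded only if every eigenvalue of I - \eta H lies in [-1, 1], i.e.,
\[
  \lambda_{\max}\!\left( \nabla^2 L(\theta^*) \right) \;\le\; \frac{2}{\eta}.
\]
```

Larger step sizes thus rule out sharper minima; the paper lifts this parameter-space constraint to the function-space bound on the weighted L1 norm of the second derivative.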

GAN "Steerability" Without Optimization

Nurit Spingarn Eliezer, Ron Banner, Tomer Michaeli

Recent research has shown remarkable success in revealing "steering" directions in the latent spaces of pre-trained GANs. These directions correspond to semantically meaningful image transformations (e.g., shift, zoom, color manipulations), and have the same interpretable effect across all categories that the GAN can generate. Some methods focus on user-specified transformations, while others discover transformations in an unsupervised manner. However, all existing techniques rely on an optimization procedure to expose those directions, and offer no control over the degree of allowed interaction between different transformations. In this paper, we show that "steering" trajectories can be computed in closed form directly from the generator's weights, without any form of training or optimization. This applies to user-prescribed geometric transformations, as well as to unsupervised discovery of more complex effects. Our approach allows determining both linear and nonlinear trajectories, and has many advantages over previous methods. In particular, we can control whether one transformation is allowed to come at the expense of another (e.g., zoom-in with or without allowing translation to keep the object centered). Moreover, we can determine the natural end-point of the trajectory, which corresponds to the largest extent to which a transformation can be applied without incurring degradation. Finally, we show how transferring attributes between images can be achieved without optimization, even across different categories.
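
As a toy illustration of the closed-form idea (my reading of the abstract, with hypothetical dimensions; real generators such as StyleGAN have far larger first layers, and the paper also derives nonlinear trajectories), one can solve for a latent direction that reproduces a linear transformation of the first-layer feature map in a least-squares sense:

```python
# Toy sketch: a latent "shift right" direction computed in closed form from a
# generator's first (linear) layer, without any training or optimization.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the first layer: z (d,) -> feature map (c, h, w).
d, c, h, w = 128, 8, 4, 4                   # square case, so matching is exact
W = rng.standard_normal((c * h * w, d))

def shift_right(feat):
    """Linear pixel-space transformation T: shift the feature map one column right."""
    f = feat.reshape(c, h, w)
    out = np.zeros_like(f)
    out[:, :, 1:] = f[:, :, :-1]
    return out.reshape(-1)

# T is linear, so build its matrix column by column: T_mat @ v == T(v).
T_mat = np.stack([shift_right(e) for e in np.eye(c * h * w)], axis=1)

# Require W(z + M z) ≈ T(W z) for all z; least squares gives M = W^+ (T_mat W - W).
M = np.linalg.pinv(W) @ (T_mat @ W - W)

z = rng.standard_normal(d)
err = np.linalg.norm(W @ (z + M @ z) - T_mat @ (W @ z))
print(f"first-layer matching error: {err:.2e}")   # ~0 in this square toy case
```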

Catch-A-Waveform: Learning to Generate Audio from a Single Short Example (NeurIPS 2021)

Gal Greshler, Tamar Shaham, Tomer Michaeli

Deep Self-Dissimilarities as Powerful Visual Fingerprints

Idan Kligvasser, Tamar Shaham, Yuval Bahat, Tomer Michaeli

Features extracted from deep layers of classification networks are widely used as image descriptors. Here, we exploit an unexplored property of these features: their internal dissimilarity. While small image patches are known to have similar statistics across image scales, it turns out that the internal distribution of deep features varies distinctively between scales. We show how this deep self-dissimilarity (DSD) property can be used as a powerful visual fingerprint. Particularly, we illustrate that full-reference and no-reference image quality measures derived from DSD are highly correlated with human preference. In addition, incorporating DSD as a loss function in the training of image restoration networks leads to results that are at least as photo-realistic as those obtained by GAN-based methods, while not requiring adversarial training.
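
The fingerprint itself can be approximated in a few lines. The sketch below (an illustrative distance, not the paper's exact DSD measure) compares per-channel feature statistics of an image and its downscaled version using an off-the-shelf VGG-16, whose ImageNet weights are downloaded on first use:

```python
# Toy deep self-dissimilarity: distance between deep-feature statistics
# computed at two image scales.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

model = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()  # up to relu3_3

def feat_stats(x):
    with torch.no_grad():
        f = model(x).flatten(2)             # (1, C, H*W)
    return f.mean(-1), f.std(-1)            # per-channel statistics

x = torch.rand(1, 3, 256, 256)              # stand-in image
x_small = F.interpolate(x, scale_factor=0.5, mode="bicubic", align_corners=False)

(m1, s1), (m2, s2) = feat_stats(x), feat_stats(x_small)
dsd = torch.cat([m1 - m2, s1 - s2], dim=-1).norm()
print(f"toy DSD score: {dsd.item():.4f}")
```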

A Theory of the Distortion-Perception Tradeoff in Wasserstein Space

Dror Freirich, Tomer Michaeli, Ron Meir

The lower the distortion of an estimator, the more the distribution of its outputs generally deviates from the distribution of the signals it attempts to estimate. This phenomenon, known as the perception-distortion tradeoff, has captured significant attention in image restoration, where it implies that fidelity to ground truth images comes at the expense of perceptual quality (deviation from statistics of natural images). However, despite the increasing popularity of performing comparisons on the perception-distortion plane, there remains an important open question: what is the minimal distortion that can be achieved under a given perception constraint? In this paper, we derive a closed-form expression for this distortion-perception (DP) function for the mean squared error (MSE) distortion and the Wasserstein-2 perception index. We prove that the DP function is always quadratic, regardless of the underlying distribution. This stems from the fact that estimators on the DP curve form a geodesic in Wasserstein space. In the Gaussian setting, we further provide a closed-form expression for such estimators. For general distributions, we show how these estimators can be constructed from the estimators at the two extremes of the tradeoff: the global MSE minimizer, and a minimizer of the MSE under a perfect perceptual quality constraint. The latter can be obtained as a stochastic transformation of the former.
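
Schematically, the construction in the last sentences can be written as follows (a symbolic restatement of the abstract, with details as in the paper):

```latex
\[
  \hat{X}_\alpha \;=\; (1-\alpha)\,\hat{X}_0 \;+\; \alpha\,\hat{X}_1,
  \qquad \alpha \in [0,1],
\]
% where \hat{X}_0 = E[X | Y] is the global MSE minimizer and \hat{X}_1 minimizes
% the MSE subject to the perfect perceptual quality constraint p_{\hat{X}} = p_X
% (obtainable as a stochastic transformation of \hat{X}_0). Sweeping \alpha
% traverses the DP curve; its quadratic form reflects that the distributions of
% \hat{X}_\alpha trace a geodesic in Wasserstein-2 space.
```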

Deep Adaptation Control for Acoustic Echo Cancellation (ICASSP 2022)

Amir Ivry, Israel Cohen, Baruch Berdugo

Attenuation of Acoustic Early Reflections in Television Studios Using Pretrained Speech Synthesis Neural Network (ICASSP 2022)

Tomer Rosenbaum, Israel Cohen, Emil Winebrand

Machine learning and digital signal processing have been extensively used to enhance speech. However, methods to reduce early reflections in studio settings are usually related to the physical characteristics of the room. In this paper, we address the problem of early acoustic reflections in television studios and control rooms, and propose a two-stage method that exploits the knowledge of a pretrained speech synthesis generator. First, given a degraded speech signal that includes the direct sound and early reflections, a U-Net convolutional neural network is used to attenuate the early reflections in the spectral domain. Then, a pretrained speech synthesis generator reconstructs the phase to predict an enhanced speech signal in the time domain. Qualitative and quantitative experimental results demonstrate excellent studio-quality speech enhancement.
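
The two-stage pipeline can be sketched as follows, with stand-in modules where the paper uses a trained U-Net and a pretrained speech-synthesis generator (the stub below reuses the noisy phase instead of synthesizing it, so it only illustrates the data flow):

```python
import torch
import torch.nn as nn

class SpectralUNetStub(nn.Module):
    """Placeholder for the spectral-domain U-Net of stage 1."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(1, 1, kernel_size=3, padding=1)
    def forward(self, mag):
        return torch.relu(self.net(mag))    # predicted clean magnitude

def enhance(wave, n_fft=512, hop=128):
    window = torch.hann_window(n_fft)
    spec = torch.stft(wave, n_fft, hop, window=window, return_complex=True)
    mag = spec.abs()[None, None]            # (1, 1, F, T)

    # Stage 1: attenuate early reflections in the magnitude domain.
    clean_mag = SpectralUNetStub()(mag).squeeze()

    # Stage 2: the paper's pretrained synthesis generator predicts a time-domain
    # signal (reconstructing phase); here we simply reuse the degraded phase.
    rec = torch.istft(torch.polar(clean_mag, spec.angle()), n_fft, hop, window=window)
    return rec

print(enhance(torch.randn(16000)).shape)    # 1 s of toy audio at 16 kHz
```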

Robust Differential Beamforming with Rectangular Arrays (EUSIPCO 2021)

Gal Itzhak, Israel Cohen, Jacob Benesty

In this paper, we present a robust approach for rectangular differential beamforming. First, we propose to employ a 2-D multistage mean that operates independently on the columns and rows of the observation signals of a uniform rectangular array (URA). The number of mean stages along the columns and rows of the URA is controlled by two parameters, Qc and Qr, respectively. Then, the resulting matrix is concatenated column-wise into a vector to serve as a modified version of the observations. Finally, we design a rectangular differential beamformer and apply it to the latter. We show that the first mean operation improves the white noise robustness of the resulting beamformer. We focus on the maximum directivity factor (MDF) and null-constrained maximum directivity factor (NCMDF) differential beamformers and analyze their performance in terms of both the white noise gain (WNG) and directivity factor (DF) measures. We show that the configuration (Qc, Qr) constitutes a useful means of mitigating the white noise amplification of differential beamformers at low frequencies.
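
A minimal sketch of the 2-D multistage mean, as I read it from the abstract (each stage averages neighboring sensors, so Q stages amount to a binomial window along that dimension):

```python
import numpy as np

def multistage_mean_2d(X, Qc, Qr):
    """X: (rows, cols) snapshot of URA observations. Each stage averages
    adjacent sensors, shrinking that dimension by one."""
    for _ in range(Qr):                     # stages along the rows
        X = 0.5 * (X[:-1, :] + X[1:, :])
    for _ in range(Qc):                     # stages along the columns
        X = 0.5 * (X[:, :-1] + X[:, 1:])
    return X.flatten(order="F")             # column-wise concatenation

rng = np.random.default_rng(1)
snapshot = rng.standard_normal((6, 5)) + 1j * rng.standard_normal((6, 5))
y = multistage_mean_2d(snapshot, Qc=2, Qr=1)
print(y.shape)                              # (6-1) rows x (5-2) cols -> (15,)
```

The differential beamformer is then designed for this shorter vector y; the averaging stages are what improve the white noise gain at low frequencies.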

Quadratic Beamforming for Magnitude Estimation (EUSIPCO 2021)

Gal Itzhak, Jacob Benesty, Israel Cohen

In this paper, we introduce an optimal quadratic Wiener beamformer for magnitude estimation of a desired signal. For simplicity, we focus on a two-microphone array and develop an iterative algorithm for magnitude estimation based on a quadratic multichannel noise reduction approach. We analyze two test cases, with uncorrelated and correlated noise. In each case, we derive the appropriate versions of the Wiener beamformer, as well as their corresponding unbiased magnitude estimators. We compare the root-mean-square errors (RMSEs) of the linear and quadratic Wiener beamformers and show that for low input signal-to-noise ratios (SNRs), the RMSE obtained with the proposed approach is either lower than or equal to the RMSE obtained with the linear Wiener beamformer, depending on the type of noise and its distribution.
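
One ingredient worth spelling out is how a quadratic statistic can yield an unbiased magnitude estimate. The generic identity E[y^H Q y] = |s|^2 d^H Q d + tr(Q Phi_v), for y = s d + v with zero-mean noise, suggests the debiasing below; this is a simplified illustration, not the paper's optimal quadratic Wiener beamformer:

```python
import numpy as np

rng = np.random.default_rng(2)
M = 2                                       # two microphones, as in the paper
d = np.ones(M) / np.sqrt(M)                 # toy steering vector
Phi_v = 0.5 * np.eye(M)                     # uncorrelated noise covariance
Q = np.outer(d, d)                          # a simple quadratic beamformer

s = 1.5 * np.exp(1j * 0.3)                  # true signal, |s|^2 = 2.25
samples = []
for _ in range(20000):
    v = 0.5 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    y = s * d + v
    samples.append((y.conj() @ Q @ y).real)

raw = np.mean(samples)                      # ≈ |s|^2 d^H Q d + tr(Q Phi_v)
debiased = (raw - np.trace(Q @ Phi_v)) / (d @ Q @ d)
print(f"|s|^2 = {abs(s)**2:.3f}, debiased estimate ≈ {debiased:.3f}")
```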

Optimal Design of Constant Beamwidth Beamformers with Concentric Ring Array (M.Sc. seminar)

Avital Kleiman, Supervised by Prof. Israel Cohen

Applications in communication, medical diagnosis, radar, and speech processing require dealing with wideband signals. Wideband beamforming is performed by adjusting the weight given to each sensor under specified constraints and by choosing a suitable array geometry. One of the main challenges in designing a wideband beamformer is maintaining a constant beamwidth over a wide range of frequencies. In standard beamforming methods, the main beam becomes narrower as the frequency increases, which may distort the output signal. Our work introduces constant-beamwidth wideband beamformers for concentric ring arrays. Specifically, we first present constant-beamwidth beamformers designed with pre-defined ring locations and weighting window functions. Next, we define an optimization problem to choose the optimal weight values. Finally, we present nonuniform constant-beamwidth beamformers with variable ring locations and attenuation values. The design considerations, theoretical analysis, and performance comparisons show the advantages of the concentric ring array geometry and its potential in a physical setup.
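
The design challenge is easy to visualize numerically. The toy sketch below (a single ring with uniform weights, hypothetical radius and sensor count) shows the -3 dB beamwidth shrinking with frequency, which is precisely what the proposed constant-beamwidth designs counteract with multiple rings and optimized weights:

```python
import numpy as np

c = 343.0                                   # speed of sound [m/s]
M, r = 16, 0.1                              # sensors on the ring, radius [m]
phi = 2 * np.pi * np.arange(M) / M          # sensor angles
theta = np.linspace(-np.pi / 2, np.pi / 2, 721)

def beampattern(f):
    k = 2 * np.pi * f / c
    # Ring in the xy-plane, source in the xz-plane (azimuth 0), uniform weights.
    delays = r * np.cos(phi)[None, :] * np.sin(theta)[:, None]
    return np.abs(np.exp(1j * k * delays).mean(axis=1))

for f in (1000.0, 2000.0, 4000.0):
    B = beampattern(f)
    main = theta[B >= B.max() / np.sqrt(2)]           # -3 dB region
    print(f"{f:5.0f} Hz: beamwidth ≈ {np.degrees(main.max() - main.min()):.1f} deg")
```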

Hyperbolic Procrustes Analysis Using Riemannian Geometry (NeurIPS 2021)

Ya-Wei Eileen Lin, Yuval Kluger, Ronen Talmon

Label-free alignment between datasets collected at different times, locations, or by different instruments is a fundamental scientific task. Hyperbolic spaces have recently provided a fruitful foundation for the development of informative representations of hierarchical data. Here, we take a purely geometric approach for label-free alignment of hierarchical datasets and introduce hyperbolic Procrustes analysis (HPA). HPA consists of new implementations of the three prototypical Procrustes analysis components: translation, scaling, and rotation, based on the Riemannian geometry of the Lorentz model of hyperbolic space. We analyze the proposed components, highlighting their useful properties for alignment. The efficacy of HPA, its theoretical properties, stability and computational efficiency are demonstrated in simulations. In addition, we showcase its performance on three batch correction tasks involving gene expression and mass cytometry data. Specifically, we demonstrate high-quality unsupervised batch effect removal from data acquired at different sites and with different technologies that outperforms recent methods for label-free alignment in hyperbolic spaces.
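
For orientation, the Lorentz-model primitives that such an approach builds on can be written compactly; the sketch below gives the standard exponential and logarithmic maps on the hyperboloid (textbook formulas, not the paper's HPA components themselves):

```python
import numpy as np

def minkowski(u, v):
    """Minkowski inner product <u, v>_L = -u0*v0 + <u_1:, v_1:>."""
    return -u[0] * v[0] + u[1:] @ v[1:]

def exp_map(x, v):
    """Exponential map at x of tangent vector v (Lorentz model)."""
    n = np.sqrt(max(minkowski(v, v), 0.0))
    return x if n < 1e-12 else np.cosh(n) * x + np.sinh(n) * v / n

def log_map(x, y):
    """Logarithmic map: tangent vector at x pointing toward y."""
    a = -minkowski(x, y)                    # = cosh(d(x, y))
    dist = np.arccosh(max(a, 1.0))
    u = y - a * x
    un = np.sqrt(max(minkowski(u, u), 0.0))
    return np.zeros_like(x) if un < 1e-12 else dist * u / un

o = np.array([1.0, 0.0, 0.0])               # origin of the 2-D hyperboloid
v = np.array([0.0, 0.3, 0.4])               # tangent vector at o
y = exp_map(o, v)
print(minkowski(y, y))                      # ≈ -1: y lies on the hyperboloid
print(log_map(o, y))                        # ≈ v: log inverts exp
```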

Joint Geometric and Topological Analysis of Hierarchical Datasets (ECML PKDD 2021)

Lior Aloni, Omer Bobrowski, Ronen Talmon

In a world abundant with diverse data arising from complex acquisition techniques, there is a growing need for new data analysis methods. In this paper we focus on high-dimensional data that are organized into several hierarchical datasets. We assume that each dataset consists of complex samples, and every sample has a distinct irregular structure modeled by a graph. The main novelty in this work lies in the combination of two complementing powerful data-analytic approaches: topological data analysis (TDA) and geometric manifold learning. Geometry primarily contains local information, while topology inherently provides global descriptors. Based on this combination, we present a method for building an informative representation of hierarchical datasets. At the finer (sample) level, we devise a new metric between samples based on manifold learning that facilitates quantitative structural analysis. At the coarser (dataset) level, we employ TDA to extract qualitative structural information from the datasets. We showcase the applicability and advantages of our method on simulated data and on a corpus of hyper-spectral images. We show that an ensemble of hyper-spectral images exhibits a hierarchical structure that fits the considered setting well. In addition, we show that our new method gives rise to superior classification results compared to state-of-the-art methods.
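
As a small taste of the finer (sample) level, one can attach a spectral descriptor to each graph-structured sample and compare samples through it; the heat-trace signature below is an illustrative choice, not the paper's metric, and the coarser (dataset) level would add topological summaries such as persistence diagrams:

```python
import numpy as np

def random_graph(n, p, rng):
    """Erdos-Renyi-style symmetric adjacency matrix."""
    A = np.triu((rng.random((n, n)) < p).astype(float), 1)
    return A + A.T

def heat_trace(A, ts=(0.1, 1.0, 10.0)):
    """Descriptor tr(exp(-t L)) of the graph Laplacian, for several scales t."""
    L = np.diag(A.sum(axis=1)) - A
    lam = np.linalg.eigvalsh(L)
    return np.array([np.exp(-t * lam).sum() for t in ts])

rng = np.random.default_rng(3)
A_sparse, A_dense = random_graph(20, 0.2, rng), random_graph(20, 0.6, rng)
dist = np.linalg.norm(heat_trace(A_sparse) - heat_trace(A_dense))
print(f"descriptor distance between a sparse and a dense graph: {dist:.2f}")
```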

Metric Learning and Domain Adaptation for High-Dimensional Data (Ph.D. seminar)

Almog Lahav, Supervised by Prof. Ronen Talmon

Symmetric positive semi-definite (SPSD) matrices are common data features in contemporary data analysis. Notable examples of such features are (low-rank) covariance matrices, various kernel matrices, and graph Laplacians. We present new results on the Riemannian geometry of SPSD matrices, leading to a convenient mathematical framework for developing data analysis methods that rely on these useful data features. Based on the new mathematical framework, we propose two algorithms for domain adaptation. We show that these algorithms can be applied not only to SPSD matrices but also to any high-dimensional dataset with an intrinsic low-dimensional structure. The performance of the algorithms is demonstrated in applications to real data, including hyperspectral imaging (HSI), motion recognition, and electroencephalography (EEG).
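
The classical object being generalized here is the affine-invariant geometry of strictly positive-definite matrices; below is a minimal sketch of its Riemannian distance (the SPSD, possibly rank-deficient, case requires the seminar's extended framework):

```python
import numpy as np
from scipy.linalg import fractional_matrix_power, logm

def spd_distance(A, B):
    """Affine-invariant distance d(A, B) = ||log(A^{-1/2} B A^{-1/2})||_F."""
    A_inv_sqrt = fractional_matrix_power(A, -0.5)
    return np.linalg.norm(logm(A_inv_sqrt @ B @ A_inv_sqrt), "fro")

rng = np.random.default_rng(4)
A = np.cov(rng.standard_normal((100, 5)), rowvar=False)        # covariance features
B = np.cov(2.0 * rng.standard_normal((100, 5)), rowvar=False)
print(f"affine-invariant distance: {spd_distance(A, B):.3f}")
```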

Adversarially Robust Conformal Prediction (ICLR 2022)

Asaf Gendler, Tsui-Wei Weng, Luca Daniel, Yaniv Romano

Conformal prediction is a model-agnostic tool for constructing prediction sets that are valid under the common i.i.d. assumption; it has been applied to quantify the prediction uncertainty of deep net classifiers. In this paper, we generalize this framework to the case where adversaries exist during inference time, under which the i.i.d. assumption is grossly violated. By combining conformal prediction with randomized smoothing, our proposed method forms a prediction set with a finite-sample coverage guarantee that holds for any data distribution with ℓ2-norm bounded adversarial noise, generated by any adversarial attack algorithm. The core idea is to bound the Lipschitz constant of the non-conformity score by smoothing it with Gaussian noise and leverage this knowledge to account for the effect of the unknown adversarial perturbation. We demonstrate the necessity of our method in the adversarial setting and the validity of our theoretical guarantee on three widely used benchmark data sets: CIFAR10, CIFAR100, and ImageNet.
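
The mechanics can be sketched in a few lines: smooth the score with Gaussian noise, then inflate the conformal quantile by a Lipschitz-based margin. Everything below is a simplified stand-in (toy score, a fixed Lipschitz constant) rather than the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(5)

def smoothed_score(score_fn, x, sigma=0.25, n_mc=128):
    """Monte Carlo estimate of the Gaussian-smoothed non-conformity score."""
    noise = sigma * rng.standard_normal((n_mc,) + x.shape)
    return np.mean([score_fn(x + e) for e in noise])

def robust_threshold(cal_scores, alpha, eps, lip):
    """Conformal quantile, inflated to absorb any ||delta||_2 <= eps attack."""
    n = len(cal_scores)
    q = np.quantile(cal_scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")
    return q + lip * eps

score = lambda x: float(np.tanh(x).sum())   # toy non-conformity score on R^2
cal = np.array([smoothed_score(score, rng.standard_normal(2)) for _ in range(200)])
tau = robust_threshold(cal, alpha=0.1, eps=0.125, lip=4.0)
print(f"robust prediction-set threshold: {tau:.3f}")
```

A label is then included in the prediction set whenever its smoothed score falls below the inflated threshold, which is how the set absorbs any ℓ2-bounded perturbation of radius eps.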

Improving Conditional Coverage via Orthogonal Quantile Regression (NeurIPS 2021)

Shai Feldman, Stephen Bates, Yaniv Romano

We develop a method to generate prediction intervals that have a user-specified coverage level across all regions of feature space, a property called conditional coverage. A typical approach to this task is to estimate the conditional quantiles with quantile regression; it is well known that this leads to correct coverage in the large-sample limit, although it may not be accurate in finite samples. We find in experiments that traditional quantile regression can have poor conditional coverage. To remedy this, we modify the loss function to promote independence between the size of the intervals and the indicator of a miscoverage event. For the true conditional quantiles, these two quantities are independent (orthogonal), so the modified loss function continues to be valid. Moreover, we empirically show that the modified loss function leads to improved conditional coverage, as evaluated by several metrics. We also introduce two new metrics that check conditional coverage by looking at the strength of the dependence between the interval size and the indicator of miscoverage.
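
The loss modification is simple to write down. Below is an illustrative version that penalizes the Pearson correlation between interval width and the miscoverage indicator (one natural instance of the decorrelation idea; the paper's exact penalties may differ):

```python
import torch

def pinball(pred, y, q):
    """Standard quantile (pinball) loss at level q."""
    diff = y - pred
    return torch.mean(torch.maximum(q * diff, (q - 1) * diff))

def orthogonal_qr_loss(lo, hi, y, alpha=0.1, lam=0.5):
    base = pinball(lo, y, alpha / 2) + pinball(hi, y, 1 - alpha / 2)
    width = hi - lo
    miss = ((y < lo) | (y > hi)).float()    # miscoverage indicator
    # For the true conditional quantiles, width and miscoverage are independent,
    # so this penalty vanishes and the loss stays valid.
    wc, mc = width - width.mean(), miss - miss.mean()
    corr = (wc * mc).mean() / (wc.std() * mc.std() + 1e-8)
    return base + lam * corr.abs()

y = torch.randn(256)                        # toy targets; lo/hi would come from a net
lo, hi = y - 1.0 + 0.1 * torch.randn(256), y + 1.0 + 0.1 * torch.randn(256)
print(orthogonal_qr_loss(lo, hi, y).item())
```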