KDD2018

Learning Structural Node Embeddings Via Diffusion Wavelets

Claire Donnat, Marinka Zitnik, David Hallac, Jure Leskovec

(Submitted on 27 Oct 2017 (v1), last revised 20 Jun 2018 (this version, v4))
Nodes residing in different parts of a graph can have similar structural roles within their local network topology. The identification of such roles provides key insight into the organization of networks and can be used for a variety of machine learning tasks. However, learning structural representations of nodes is a challenging problem, and it has typically involved manually specifying and tailoring topological features for each node. In this paper, we develop GraphWave, a method that represents each node's network neighborhood via a low-dimensional embedding by leveraging heat wavelet diffusion patterns. Instead of training on hand-selected features, GraphWave learns these embeddings in an unsupervised way. We mathematically prove that nodes with similar network neighborhoods will have similar GraphWave embeddings even though these nodes may reside in very different parts of the network, and our method scales linearly with the number of edges. Experiments in a variety of different settings demonstrate GraphWave's real-world potential for capturing structural roles in networks, and our approach outperforms existing state-of-the-art baselines in every experiment, by as much as 137%.
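
A minimal numpy sketch of the diffusion-wavelet idea as we read it (the toy path graph, scale parameter, and sample points are our own choices, not the authors' code): diffuse heat from every node, then embed each node by the empirical characteristic function of its wavelet coefficients.

```python
import numpy as np

# 5-node path graph: nodes 0 and 4 play identical structural roles.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(1)) - A

# Heat kernel Psi = exp(-s * L); column a is the wavelet centered at node a.
lam, U = np.linalg.eigh(L)
s = 0.5
Psi = (U * np.exp(-s * lam)) @ U.T

# Embed node a via the empirical characteristic function of its wavelet
# coefficients, sampled at a few points t and split into cos/sin parts.
ts = np.linspace(0.5, 2.0, 4)
emb = np.hstack([np.cos(t * Psi).mean(axis=0)[:, None] for t in ts] +
                [np.sin(t * Psi).mean(axis=0)[:, None] for t in ts])

# Structurally equivalent nodes get (numerically) identical embeddings,
# even though they sit at opposite ends of the graph.
print(np.allclose(emb[0], emb[4]))  # True
```

The point of the characteristic function is that it treats the wavelet coefficients as an unordered distribution, so two nodes with the same local topology match regardless of where they sit in the graph.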

arXiv:1811.12804

Abstract: This paper is concerned with a curious phenomenon in spectral estimation. Suppose we are interested in a rank-1 and symmetric matrix M⋆∈Rn×n, yet only a randomly perturbed version M is observed. The perturbation/noise matrix M−M⋆ is composed of independent and zero-mean entries and is not symmetric. This might arise, for example, when we have two independent samples for each entry of M⋆ and arrange them into an *asymmetric* data matrix M. The aim is to estimate the leading eigenvalue and eigenvector of M⋆. Somewhat unexpectedly, our findings reveal that the leading eigenvalue of the data matrix M can be √n times more accurate than its leading singular value in eigenvalue estimation. Further, the perturbation of any linear form of the leading eigenvector of M (e.g., entrywise eigenvector perturbation) is provably well-controlled. We further provide partial theory for the more general rank-r case; this allows us to accommodate the case when M⋆ is rank-1 but asymmetric, by considering eigen-decomposition of the associated rank-2 dilation matrix. The takeaway message is this: arranging the data samples in an asymmetric manner and performing eigen-decomposition (as opposed to SVD) could sometimes be quite beneficial.
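
A quick numerical illustration of the headline claim (our own toy experiment; the dimension and signal strength are arbitrary choices above the detection threshold):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
lam_star = 2.0 * np.sqrt(n)            # signal strength of the rank-1 spike

# Asymmetric observation: rank-1 signal plus independent zero-mean noise.
M = lam_star * np.outer(u, u) + rng.standard_normal((n, n))

eigs = np.linalg.eigvals(M)
lam_hat = eigs[np.argmax(eigs.real)].real            # leading eigenvalue
sig_hat = np.linalg.svd(M, compute_uv=False)[0]      # leading singular value

err_eig = abs(lam_hat - lam_star)
err_sv = abs(sig_hat - lam_star)
print(err_eig < err_sv)  # eigen-decomposition beats SVD on this instance
```

The singular value inherits an upward bias of order n/λ from the noise, while the eigenvalue of the asymmetric matrix concentrates much more tightly around λ⋆, which is exactly the phenomenon the paper quantifies.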

ICLR2019: https://openreview.net/forum?id=HkgEQnRqYQ&noteId=HkgEQnRqYQ

Abstract: We study the problem of learning representations of entities and relations in knowledge graphs for predicting missing links. The success of such a task heavily relies on the ability of modeling and inferring the patterns of (or between) the relations. In this paper, we present a new approach for knowledge graph embedding called RotatE, which is able to model and infer various relation patterns including: symmetry/antisymmetry, inversion, and composition. Specifically, the RotatE model defines each relation as a rotation from the source entity to the target entity in the complex vector space. In addition, we propose a novel self-adversarial negative sampling technique for efficiently and effectively training the RotatE model. Experimental results on multiple benchmark knowledge graphs show that the proposed RotatE model is not only scalable, but also able to infer and model various relation patterns and significantly outperform existing state-of-the-art models for link prediction.
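
The scoring function is compact enough to sketch directly: each relation is a vector of phases acting as an elementwise rotation in complex space (the dimensions and random entities below are our own toy choices):

```python
import numpy as np

def rotate_score(h, phase, t):
    """RotatE-style score: negative distance after rotating h by the relation."""
    r = np.exp(1j * phase)                 # unit-modulus relation embedding
    return -np.linalg.norm(h * r - t)      # less negative = more plausible triple

rng = np.random.default_rng(0)
d = 8
h = rng.standard_normal(d) + 1j * rng.standard_normal(d)
phase = rng.uniform(-np.pi, np.pi, d)
t = h * np.exp(1j * phase)                 # construct a triple that r maps exactly

print(rotate_score(h, phase, t))           # ~0: best possible score
print(rotate_score(t, -phase, h))          # ~0: the inverse relation is -phase
```

The second call shows why inversion is modeled for free: negating the phases rotates back, and composition of relations is simply addition of phase vectors.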

Speaker: Jinze Wu

**shape descriptor**

K. Gebal, J. Bærentzen, H. Aanæs, and R. Larsen. Shape Analysis Using the Auto Diffusion Function. Computer Graphics Forum, 28(5):1405–1413, 2009.

J. Sun, M. Ovsjanikov, and L. Guibas. A Concise and Provably Informative Multi-Scale Signature Based on Heat Diffusion. Computer Graphics Forum, 28(5):1383–1392, 2009.
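
The heat kernel signature from the Sun et al. paper has a very compact discrete form; here is a sketch on a graph Laplacian standing in for the Laplace-Beltrami operator (the toy graph and time samples are our choices):

```python
import numpy as np

def hks(L, ts):
    """HKS(x, t) = sum_i exp(-lam_i * t) * phi_i(x)^2 for a Laplacian L."""
    lam, phi = np.linalg.eigh(L)
    return (phi ** 2) @ np.exp(-np.outer(lam, ts))   # shape (n_nodes, n_times)

# 5-node path graph; its two endpoints are structurally identical.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(1)) - A

H = hks(L, np.array([0.1, 1.0, 10.0]))
print(np.allclose(H[0], H[4]))  # True: symmetric points get equal signatures
```

Small t captures local geometry, large t global structure, which is what makes the signature multi-scale.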

The wave kernel signature: A quantum mechanical approach to shape analysis http://imagine.enpc.fr/~aubrym/projects/wks

Pose-Consistent 3D Shape Segmentation Based on a Quantum Mechanical Feature Descriptor PR2011

Learning Spectral Descriptors for Deformable Shape Correspondence (PAMI)

**Functional Map**

Functional maps: a flexible representation of maps between shapes

Kernel Functional Maps (SGP 2018)

**Other Descriptors**

Nonisometric Surface Registration via Conformal Laplace-Beltrami Basis Pursuit arXiv:1809.07399

R. Lai and H. Zhao. Multi-scale Non-Rigid Point Cloud Registration Using Robust Sliced-Wasserstein Distance via Laplace-Beltrami Eigenmap. Mathematics, 10(2), 2014.

Hold: Shicong Cen

Centralized Optimization

Communication Efficient Distributed Optimization using an Approximate Newton-type Method (Ohad Shamir, Nathan Srebro, Tong Zhang, 2013)

- DANE

Communication-Efficient Distributed Dual Coordinate Ascent (Martin Jaggi, Virginia Smith, Martin Takac, Jonathan Terhorst, Sanjay Krishnan, Thomas Hofmann, Michael I. Jordan, 2014)

- CoCoA
- Distributed dual updates for loss functions of linear predictors

DiSCO: Distributed Optimization for Self-Concordant Empirical Loss (Yuchen Zhang, Lin Xiao, 2015)

- Damped Newton + Distributed PCG

AIDE: Fast and Communication Efficient Distributed Optimization (Sashank J. Reddi, Jakub Konečný, Peter Richtárik, Barnabás Póczós, Alex Smola, 2016)

- DANE + Nesterov's acceleration
- Convergence analysis of inexact DANE

GIANT: Globally Improved Approximate Newton Method for Distributed Optimization (Shusen Wang, Farbod Roosta-Khorasani, Peng Xu, Michael W. Mahoney, 2017)

- Distributed Newton steps

Decentralized Optimization

On the Convergence of Decentralized Gradient Descent (Kun Yuan, Qing Ling, Wotao Yin, 2013)
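
The DGD iteration analyzed in this paper is short enough to sketch; the ring topology, Metropolis-style weights, and quadratic local objectives below are our own toy setup. With a constant step size, DGD reaches only an O(alpha) neighborhood of the consensus optimum, which is exactly the limitation later exact methods (EXTRA, exact diffusion) remove:

```python
import numpy as np

n, alpha, T = 5, 0.01, 5000
rng = np.random.default_rng(1)
b = rng.standard_normal(n)                 # agent i holds f_i(x) = 0.5*(x - b_i)^2

# Doubly stochastic mixing matrix for a ring: equal weight on self + 2 neighbors.
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i, i + 1):
        W[i, j % n] = 1.0 / 3.0

x = np.zeros(n)
for _ in range(T):
    x = W @ x - alpha * (x - b)            # mix with neighbors, then local gradient step

# All agents end up near the global minimizer mean(b), up to an O(alpha) bias.
print(np.max(np.abs(x - b.mean())))
```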

D-ADMM: A communication-efficient distributed algorithm for separable optimization (João F. C. Mota, João M. F. Xavier, Pedro M. Q. Aguiar, and Markus Püschel, 2013)

On the Linear Convergence of the ADMM in Decentralized Consensus Optimization (Wei Shi, Qing Ling, Kun Yuan, Gang Wu, and Wotao Yin, 2014)

EXTRA: An Exact First-Order Algorithm for Decentralized Consensus Optimization (Wei Shi, Qing Ling, Gang Wu, Wotao Yin, 2014)

Network Newton (Aryan Mokhtari, Qing Ling, Alejandro Ribeiro, 2014)

- Truncated Taylor series of Newton step applied to regularized problem

A Decentralized Second-Order Method with Exact Linear Convergence Rate for Consensus Optimization (Aryan Mokhtari, Wei Shi, Qing Ling, and Alejandro Ribeiro, 2016)

- Truncated Taylor series of Newton step applied to augmented Lagrangian function

DSA: Decentralized Double Stochastic Averaging Gradient Algorithm (Aryan Mokhtari, Alejandro Ribeiro, 2016)

- EXTRA + SVRG trick

Exact Diffusion for Distributed Optimization and Learning --- Part I: Algorithm Development (Kun Yuan, Bicheng Ying, Xiaochuan Zhao, Ali H. Sayed, 2017)

Exact Diffusion for Distributed Optimization and Learning --- Part II: Convergence Analysis (Kun Yuan, Bicheng Ying, Xiaochuan Zhao, Ali H. Sayed, 2017)

A Communication-Efficient Random-Walk Algorithm for Decentralized Optimization (Wotao Yin, Xianghui Mao, Kun Yuan, Yuantao Gu, Ali H. Sayed, 2018)

- Walk ADMM

Tangent-Normal Adversarial Regularization for Semi-Supervised Learning

Speaker: Jingfeng Wu

https://arxiv.org/pdf/1808.06088.pdf

The ever-increasing size of modern datasets combined with the difficulty of obtaining label information has made semi-supervised learning of significant practical importance in modern machine learning applications. Compared with supervised learning, the key difficulty in semi-supervised learning is how to make full use of the unlabeled data. In order to utilize the manifold information provided by unlabeled data, we propose a novel regularization called the tangent-normal adversarial regularization, which is composed of two parts. The two terms complement each other and jointly enforce smoothness along two different directions that are crucial for semi-supervised learning. One is applied along the tangent space of the data manifold, aiming to enforce local invariance of the classifier on the manifold, while the other is performed on the normal space orthogonal to the tangent space, intending to impose robustness on the classifier against the noise that causes the observed data to deviate from the underlying data manifold. Both regularizers are achieved by the strategy of virtual adversarial training.

Our method has achieved state-of-the-art performance on semi-supervised learning tasks on both an artificial dataset and the FashionMNIST dataset.
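
Since both regularizers rest on virtual adversarial training, here is the core power-iteration step for the most KL-sensitive input direction, worked out analytically for a linear-logistic model (the model, variable names, and numbers are ours, not the paper's):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def vat_direction(w, x, xi=1e-6, n_iter=2):
    """Approximate the direction r maximizing KL(p(.|x) || p(.|x + r))."""
    p = sigmoid(w @ x)
    rng = np.random.default_rng(0)
    d = rng.standard_normal(x.shape)
    d /= np.linalg.norm(d)
    for _ in range(n_iter):
        q = sigmoid(w @ (x + xi * d))
        g = (q - p) * w        # analytic grad of KL(p || q) w.r.t. the perturbation
        d = g / (np.linalg.norm(g) + 1e-30)
    return d

w = np.array([3.0, -1.0, 2.0])
x = np.array([0.2, 0.5, -0.1])
d = vat_direction(w, x)
# For a linear-logistic model the adversarial direction is along +/- w.
print(abs(d @ w) / np.linalg.norm(w))
```

In the paper's setting the same perturbation is additionally projected onto the tangent or normal space of an estimated data manifold; this sketch only shows the unconstrained VAT step.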

Hold: Yiping Lu

On the Convergence of Learning-based Iterative Methods for Nonconvex Inverse Problems

Risheng Liu, Shichao Cheng, Yi He, Xin Fan, Zhouchen Lin, and Zhongxuan Luo

https://arxiv.org/pdf/1808.05331.pdf

Theoretical linear convergence of unfolded ISTA and its practical weights and thresholds

Xiaohan Chen, Jialin Liu, Zhangyang Wang, Wotao Yin
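
For reference, the classical ISTA iteration that these works unfold into a network, as a plain numpy sketch (the random problem instance and parameters below are our own):

```python
import numpy as np

def soft(v, tau):
    """Soft-thresholding: proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, n_iter=1000):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by proximal gradient steps."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft(x - A.T @ (A @ x - b) / L, lam / L)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100)) / np.sqrt(50)
x0 = np.zeros(100)
x0[:5] = rng.standard_normal(5)            # 5-sparse ground truth
b = A @ x0
x = ista(A, b, lam=0.01)
print(np.linalg.norm(x - x0))              # small: sparse signal roughly recovered
```

Unfolding replaces the fixed matrices A.T @ A and thresholds lam / L with learned, layer-wise parameters; the cited paper shows which parameterizations retain (and accelerate) convergence.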

Hao He, Bo Xin, Satoshi Ikehata, and David Wipf, "From Bayesian Sparsity to Gated Recurrent Nets," Advances in Neural Information Processing Systems (NIPS), 2017.

http://papers.nips.cc/paper/7139-from-bayesian-sparsity-to-gated-recurrent-nets

Bo Xin, Yizhou Wang, Wen Gao, Baoyuan Wang, and David Wipf, "Maximal Sparsity with Deep Networks?," Advances in Neural Information Processing Systems (NIPS), 2016.

Bora, Ashish, et al. "Compressed sensing using generative models." arXiv preprint arXiv:1703.03208 (2017).

Kabkab, Maya, Pouya Samangouei, and Rama Chellappa. "Task-Aware Compressed Sensing with Generative Adversarial Networks." arXiv preprint arXiv:1802.01284 (2018).

Van Veen, David, et al. "Compressed Sensing with Deep Image Prior and Learned Regularization." arXiv preprint arXiv:1806.06438 (2018).

Hold: Yiping Lu

Point Integral Method for Solving Poisson-type Equations on Manifolds from Point Clouds with Convergence Guarantees

https://arxiv.org/pdf/1409.2623.pdf

A Method Based on Total Variation for Network Modularity Optimization Using the MBO Scheme

https://arxiv.org/pdf/1304.4679.pdf

Low dimensional manifold model for image processing

SIAM Journal on Imaging Sciences 2017

An MBO scheme on graphs for classification and image processing

SIAM Journal on Imaging Sciences, 2013 - SIAM

Zhou, Xueyuan, and Mikhail Belkin. "Semi-supervised learning by higher order regularization." Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. 2011.

A. Bertozzi and A. Flenner, Diffuse interface models of graphs for classification of high dimensional

data, Multiscale Model. Simul., 10 (2012), pp. 1090–1118.

Natural Boundary Conditions for Smoothing in Geometry Processing

ACM Transaction on Graphics 2018

http://www.cs.columbia.edu/cg/hessians/

Stochastic Block Models are a Discrete Surface Tension

https://arxiv.org/pdf/1806.02485

Bayesian Semi-supervised Learning with Graph Gaussian Processes

https://arxiv.org/pdf/1809.04379v1.pdf

Local High-order Regularization on Data Manifolds

https://arxiv.org/abs/1602.03805

=================

Balcan, Maria-Florina, and Avrim Blum. "A PAC-style model for learning from labeled and unlabeled data." International Conference on Computational Learning Theory. Springer, Berlin, Heidelberg, 2005.

Simplified Energy Landscape for Modularity Using Total Variation

https://arxiv.org/pdf/1707.09285.pdf

Jacobs, Matt, Ekaterina Merkurjev, and Selim Esedoḡlu. "Auction dynamics: A volume constrained MBO scheme." Journal of Computational Physics 354 (2018): 288-310.

A CycleGAN-like idea:

image → restored image is a learned generator,

but the reverse "generator", restored image → image, is fixed by the physics behind the inverse problem.

two ideas:

downsampled subimage + noise level map as input

A very simple idea for choosing the stopping time of a filter:

observation = image + noise, and the noise should be independent of the image, so stop the filter at the point where the removed residual stops looking independent of the current estimate.

**question:** this assumption seems not to hold when the noise is not additive white Gaussian.
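
One way to read this stopping rule in code (a toy 1-D version; the averaging filter and the correlation criterion are our guesses at the idea): keep filtering, and stop at the iteration where the correlation between the removed residual and the current estimate is smallest.

```python
import numpy as np

def smooth(y):
    """One step of a crude 3-tap averaging filter (stand-in for a diffusion step)."""
    return np.convolve(y, [0.25, 0.5, 0.25], mode="same")

def stop_time(obs, max_iter=50):
    """Pick the iteration where |corr(residual, estimate)| is smallest."""
    u = obs.copy()
    best_k, best_c = 1, np.inf
    for k in range(1, max_iter + 1):
        u = smooth(u)
        resid = obs - u                    # what the filter has removed so far
        c = abs(np.corrcoef(resid, u)[0, 1])
        if c < best_c:
            best_c, best_k = c, k
    return best_k

rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 256)
obs = np.sin(t) + 0.3 * rng.standard_normal(t.shape)
print(stop_time(obs))
```

Under-smoothing leaves the residual almost pure noise (low correlation with the estimate); over-smoothing pushes image structure into the residual (high correlation), which is what the criterion detects.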

A network supervised by synthetically generated rain, then fine-tuned on real data. Inspired by:

Wei, W., Yi, L., Xie, Q., Zhao, Q., Meng, D., Xu, Z.: Should We Encode Rain Streaks in Video as Deterministic or Stochastic? In: 2017 IEEE International Conference on Computer Vision (ICCV), 2535–2544.

**The idea behind this series of works is also interesting:** the data term is learned by an EM algorithm, which seems to fit real data better. [They have a series of works published on all aspects of image restoration.]

"We design an EM algorithm together with a gradient descent strategy to solve the proposed model. The P-MoG parameters and network parameters can be optimized in sequence in each iteration. Experiments implemented on synthetic rainy images and especially real ones substantiate the superiority of the proposed method as compared with the state-of-the-arts along this line."
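
As a reminder of the E/M alternation the note refers to, here is a minimal EM loop for a two-component 1-D Gaussian mixture (the P-MoG model in the paper is richer and interleaved with network updates; this is only the textbook core):

```python
import numpy as np

def em_gmm_1d(x, n_iter=100):
    """EM for a 2-component 1-D Gaussian mixture; returns (means, vars, weights)."""
    mu = np.array([x.min(), x.max()])          # spread-out initialization
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from responsibility-weighted data.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3, 1, 1000), rng.normal(3, 1, 1000)])
mu, var, pi = em_gmm_1d(x)
print(np.sort(mu))  # close to [-3, 3]
```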

(CVPR2018)

talk: link

Uses the closed-form solution for the moments of a Gaussian after a nonlinear activation to do uncertainty quantification (UQ) in deep learning.
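
For a ReLU these closed forms are classical: if X ~ N(mu, sigma^2), then E[ReLU(X)] = mu·Phi(mu/sigma) + sigma·phi(mu/sigma) and E[ReLU(X)^2] = (mu^2 + sigma^2)·Phi(mu/sigma) + mu·sigma·phi(mu/sigma). A quick Monte Carlo check (the activation and parameters are our choice; the talk may cover other activations):

```python
import numpy as np
from math import erf, exp, pi, sqrt

def relu_moments(mu, sigma):
    """Closed-form first/second moments of ReLU(X) for X ~ N(mu, sigma^2)."""
    a = mu / sigma
    pdf = exp(-a * a / 2) / sqrt(2 * pi)       # standard normal pdf at a
    cdf = 0.5 * (1 + erf(a / sqrt(2)))         # standard normal cdf at a
    m1 = mu * cdf + sigma * pdf
    m2 = (mu ** 2 + sigma ** 2) * cdf + mu * sigma * pdf
    return m1, m2

rng = np.random.default_rng(0)
samples = np.maximum(rng.normal(0.5, 1.0, 1_000_000), 0.0)
m1, m2 = relu_moments(0.5, 1.0)
print(abs(m1 - samples.mean()))       # small: matches Monte Carlo
print(abs(m2 - (samples ** 2).mean()))
```

Propagating (m1, m2 - m1^2) layer by layer is what gives an analytic mean/variance estimate of the network output without sampling.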
