We’re interested in machine learning, optimization algorithms and text understanding, as well as several application domains.
Accelerating Gradient Boosting Machine. 2020. 23rd International Conference on Artificial Intelligence and Statistics, Palermo, Sicily, Italy, June 3-5, 2020.
On the Relationship between Self-Attention and Convolutional Layers. 2020. Eighth International Conference on Learning Representations – ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020.
Stochastic Zeroth-Order Optimisation Algorithms with Variance Reduction. EPFL, 2019-06-21.
A comparison of model-parallel training methods for deep learning. 2019-06-18.
Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication. 2019-06-09. ICML 2019 – International Conference on Machine Learning, Long Beach, California, USA, 9-15 June 2019.
Local SGD Converges Fast and Communicates Little. 2019-05-06. ICLR 2019 – International Conference on Learning Representations, New Orleans, USA, May 6-9, 2019.
Error Feedback Fixes SignSGD and other Gradient Compression Schemes. 2019. 36th International Conference on Machine Learning (ICML) 2019, Long Beach, USA, June 9-15, 2019. p. 3252-3261.
Overcoming Multi-model Forgetting. 2019. ICML 2019 – 36th International Conference on Machine Learning, Long Beach, California, USA, June 09-15, 2019. p. 594-603.
Open-Vocabulary Keyword Spotting with Audio and Text Embeddings. 2019. INTERSPEECH 2019 – Annual Conference of the International Speech Communication Association, Graz, Austria.
Better Word Embeddings by Disentangling Contextual n-Gram Information. 2019. NAACL 2019 – Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, Minnesota, USA, June 2-7, 2019.
Training DNNs with Hybrid Block Floating Point. 2018-12-04. NeurIPS 2018 – Neural Information Processing Systems, Montréal, Canada, December 2-8, 2018.
Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization. 2018-12-02. 32nd Annual Conference on Neural Information Processing Systems (NIPS), Montréal, Canada, December 2-8, 2018.
Sparsified SGD with Memory. 2018-12-02. NeurIPS 2018 – 32nd Annual Conference on Neural Information Processing Systems, Montréal, Canada, December 2-8, 2018.
Unsupervised Learning of Representations for Lexical Entailment Detection. 2018-09-04.
Convex Optimization using Sparsified Stochastic Gradient Descent with Memory. 2018-06-27.
Unsupervised learning of sentence embeddings using compositional n-gram features. 2018-05-01. NAACL 2018 – Conference of the North American Chapter of the Association for Computational Linguistics.
Simple Unsupervised Keyphrase Extraction using Sentence Embeddings. 2018-05-01. CoNLL 2018 – SIGNLL Conference on Computational Natural Language Learning.
On Matching Pursuit and Coordinate Descent. 2018-05-01. ICML 2018 – 35th International Conference on Machine Learning.
Adaptive balancing of gradient and update computation times using global geometry and approximate subproblems. 2018-05-01.
Learning Word Vectors for 157 Languages. 2018-02-19. Language Resources and Evaluation Conference, Miyazaki, Japan, May 7-12, 2018.
A Distributed Second-Order Algorithm You Can Trust. 2018. 35th International Conference on Machine Learning (ICML 2018), Stockholm, Sweden, 10-15 July 2018. p. 1358-1366.
Training DNNs with Hybrid Block Floating Point. 2018-01-01. NeurIPS 2018 – 32nd Conference on Neural Information Processing Systems, Montréal, Canada, December 2-8, 2018.
CoCoA: A General Framework for Communication-Efficient Distributed Optimization. Journal of Machine Learning Research. 2018-01-01. Vol. 18.
Some results on a class of mixed van der Waerden numbers. Rocky Mountain Journal of Mathematics. 2018. Vol. 48, num. 3, p. 885-904. DOI: 10.1216/RMJ-2018-48-3-885.
Gene locations may contribute to predicting gene regulatory relationships. Journal of Zhejiang University-Science B. 2018. Vol. 19, num. 1, p. 25-37. DOI: 10.1631/jzus.B1700303.
Prediction of patient-reported physical activity scores from wearable accelerometer data: a feasibility study. 2018. ICNR2018 – International Conference on NeuroRehabilitation, Pisa, Italy, October 16-20, 2018.
Asynchronous updates for stochastic gradient descent. 2017-06-09.
Fully Quantized Distributed Gradient Descent. 2017.
Safe Adaptive Importance Sampling. 2017. Neural Information Processing Systems (NIPS), Long Beach, USA, December 4-9, 2017.
Approximate Steepest Coordinate Descent. 2017. ICML 2017 – International Conference on Machine Learning, Sydney, Australia, Aug 6-11, 2017.
Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features
A Unified Optimization View on Generalized Matching Pursuit and Frank-Wolfe. 2017. 20th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017, Fort Lauderdale, Florida, USA, April 20-22, 2017.
Faster Coordinate Descent via Adaptive Importance Sampling. 2017. 20th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017, Fort Lauderdale, Florida, USA, April 20-22, 2017.
CoCoA: A General Framework for Communication-Efficient Distributed Optimization. 2016.
Screening Rules for Convex Problems
Pursuits in Structured Non-Convex Matrix Factorizations
Primal-Dual Rates and Certificates. 2016. ICML 2016 – International Conference on Machine Learning, New York City, NY, USA, June 19-24, 2016. p. 783-792.
Audio Based Bird Species Identification using Deep Learning Techniques. 2016. Conference and Labs of the Evaluation Forum (CLEF) 2016, Évora, Portugal, 5-8 September, 2016. p. 547-559.
L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework
On the Global Linear Convergence of Frank-Wolfe Optimization Variants. 2015. Neural Information Processing Systems (NIPS) 2015, Montreal, Quebec, Canada, December 7-12, 2015. p. 496-504.
Adding vs. Averaging in Distributed Primal-Dual Optimization. 2015. ICML 2015 – International Conference on Machine Learning, Lille, France, 6-11 July 2015. p. 1973-1982.
An Equivalence between the Lasso and Support Vector Machines. Regularization, Optimization, Kernels, and Support Vector Machines; Chapman and Hall/CRC, 2014. p. 1-26.
Communication-Efficient Distributed Dual Coordinate Ascent. 2014. Neural Information Processing Systems (NIPS) 2014, Montreal, Quebec, Canada, December 8-13, 2014. p. 3068-3076.
Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization. 2013. ICML 2013 – International Conference on Machine Learning, Atlanta, GA, USA, 16-21 June 2013. p. 427-435.
Block-Coordinate Frank-Wolfe Optimization for Structural SVMs. 2013. ICML 2013 – International Conference on Machine Learning, Atlanta, GA, USA, 16-21 June 2013. p. 53-61.
Regularization Paths with Guarantees for Convex Semidefinite Optimization. 2012. International Conference on Artificial Intelligence and Statistics (AISTATS) 2012, La Palma, Canary Islands, April 21-23, 2012. p. 432-439.
Optimizing over the Growing Spectrahedron. 2012. European Symposia on Algorithms (ESA) 2012, Ljubljana, Slovenia, September 10-12, 2012. p. 503-514. DOI: 10.1007/978-3-642-33090-2_44.
Approximating parameterized convex optimization problems. ACM Trans. Algorithms. 2012. Vol. 9, num. 1, p. 10:1-10:17. DOI: 10.1145/2390176.2390186.
Sparse convex optimization methods for machine learning. ETH Zürich, 2011.
Convex Optimization without Projection Steps
A Simple Algorithm for Nuclear Norm Regularized Problems. 2010. International Conference on Machine Learning (ICML) 2010, Haifa, Israel, June 21-24, 2010. p. 471-478.
Approximating Parameterized Convex Optimization Problems. 2010. European Symposia on Algorithms (ESA) 2010, Liverpool, UK, September 6-8, 2010. p. 524-535. DOI: 10.1007/978-3-642-15775-2_45.
Approximating Parameterized Convex Optimization Problems. European Symposia on Algorithms (ESA) 2010; Springer Berlin Heidelberg, 2010. p. 524-535.
A Combinatorial Algorithm to Compute Regularization Paths