I am a Researcher in the Adaptive Systems and Interaction group at MSR AI. My research lies at the intersection of human and machine intelligence. I am currently excited about two main directions in this realm: Human-AI Collaboration for enhancing human capabilities while solving complex tasks, and Troubleshooting and Failure Analysis for AI/ML systems for improving and accelerating the software development lifecycle of intelligent systems. Moreover, I am involved in various research initiatives and projects that study the societal impact of artificial intelligence as well as quality-of-service aspects of AI including interpretability, transparency, accountability, and fairness. If you are a PhD student looking for an internship position around these topics, send me an email. The Adaptive Systems and Interaction group is a fun bunch of excellent and diverse researchers.

Prior to joining MSR AI, I completed my PhD degree in 2016 at ETH Zurich (Switzerland) in the Systems Group, supervised by Prof. Donald Kossmann and Prof. Andreas Krause. My doctoral thesis focuses on building cost- and quality-aware models for integrating crowdsourcing into the process of building machine learning algorithms and systems. In 2011, I completed my master studies in computer science in a double-degree MSc program at RWTH Aachen University (Germany) and the University of Trento (Italy) as an Erasmus Mundus scholar. I also hold a Diploma in Informatics from the University of Tirana (Albania), from which I graduated in 2007.

Machine learning models are currently optimized to maximize performance on given benchmarks and test datasets. When a learning model is used by a human to accomplish a complex task or to make a high-stakes decision (e.g., medical diagnosis or recidivism prediction), team performance depends not only on model performance but also on how well humans understand when to trust the AI, so that they know when to override its decisions. In this project, we instead aim to optimize joint human-model performance. Our first steps towards this goal have been to study how models should be updated so that they do not violate the trust that users may have built during their interaction over time. By incorporating into the loss function a compatibility term with respect to the previous model, and therefore to the previous user experience, we minimize update disruption for the whole team. To facilitate studies in this field we developed the CAJA platform, which supports parameterized user studies.

Human-AI teams undergoing a model update
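The compatibility idea above can be sketched as a re-weighted training loss. The sketch below is purely illustrative (not the project's actual implementation): it trains a simple logistic regression whose loss penalizes errors on examples the previous model classified correctly (the `old_correct` mask), so that the updated model avoids introducing new mistakes that would break trust built with the old one.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_compatible(X, y, old_correct, lam=2.0, lr=0.1, steps=500):
    """Logistic regression with a compatibility penalty.

    old_correct is a boolean mask over training examples marking where
    the previous model was right. Errors on those examples are "new"
    errors that disrupt users, so their loss is up-weighted by (1 + lam).
    """
    w = np.zeros(X.shape[1])
    weights = 1.0 + lam * old_correct.astype(float)
    for _ in range(steps):
        p = sigmoid(X @ w)
        # Gradient of the weighted cross-entropy loss.
        grad = X.T @ (weights * (p - y)) / len(y)
        w -= lr * grad
    return w
```

Setting `lam=0` recovers ordinary training; larger values trade a little overall accuracy for fewer disruptive new errors.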

Collaborators: Gagan Bansal (University of Washington), Ece Kamar (Microsoft Research), Dan Weld (University of Washington), Walter Lasecki (University of Michigan), Eric Horvitz (Microsoft Research)

Building reliable AI requires a deep understanding of potential system failures. The focus of this project is to build tools that help engineers accelerate development and improvement cycles by assisting them in debugging and troubleshooting. For example, Pandora is a set of hybrid human-machine methods and tools for describing and explaining system failures. It provides descriptive performance reports that correlate input conditions with errors, guiding engineers towards discovering hidden conditions of failure.

Pandora workflow for error analysis
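As an illustration of the kind of analysis such reports enable (this is a sketch, not Pandora's actual implementation), the snippet below correlates discrete input conditions with observed errors and surfaces the conditions whose error rate is well above the system's overall rate:

```python
from collections import defaultdict

def failure_conditions(examples, errors, min_lift=1.5, min_support=5):
    """Correlate input conditions with system errors.

    examples: list of dicts mapping feature name -> discrete value.
    errors:   parallel list of booleans (True = the system failed).
    Returns (feature, value, error_rate) triples whose error rate is at
    least min_lift times the overall rate, with at least min_support
    occurrences, sorted from worst to best.
    """
    base_rate = sum(errors) / len(errors)
    stats = defaultdict(lambda: [0, 0])  # (feature, value) -> [n, n_errors]
    for ex, err in zip(examples, errors):
        for cond in ex.items():
            stats[cond][0] += 1
            stats[cond][1] += err
    found = []
    for (feat, val), (n, n_err) in stats.items():
        if n >= min_support and n_err / n >= min_lift * base_rate:
            found.append((feat, val, n_err / n))
    return sorted(found, key=lambda c: -c[2])
```

For instance, on a vision system's logs this might reveal that blurry inputs fail far more often than average, pointing the engineer at a hidden failure condition.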

In the same vein, this project has also explored troubleshooting techniques with humans in the loop. Diagnosing and fixing a complex AI system is a challenging task: errors often get propagated, suppressed, or even amplified down the computation pipeline. We propose a troubleshooting methodology that uses crowd intelligence to generate counterfactual improved states of system components. These states, which would otherwise be too expensive or infeasible to generate, are then integrated into the system execution to reveal which component fixes are the most efficient given the current system architecture.
Troubleshooting Integrative AI systems with humans in the loop
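A minimal sketch of this counterfactual evaluation, with hypothetical helper names: for each pipeline component, substitute a corrected (e.g., crowd-provided) output, rerun the rest of the pipeline, and rank components by the end-to-end gain their fix yields.

```python
def rank_component_fixes(pipeline, fixed_outputs, inputs, labels, score):
    """Rank component fixes by their counterfactual end-to-end gain.

    pipeline:      ordered list of (name, fn) stages; fn maps state -> state.
    fixed_outputs: name -> list of corrected outputs, one per input, e.g.
                   produced by crowd workers given the component's actual
                   upstream inputs.
    score:         function(outputs, labels) -> end-to-end quality.
    """
    def run(x, i, fix_at=None):
        state = x
        for name, fn in pipeline:
            # Substitute the counterfactual state for the fixed component;
            # downstream stages then consume the corrected output.
            state = fixed_outputs[name][i] if name == fix_at else fn(state)
        return state

    baseline = score([run(x, i) for i, x in enumerate(inputs)], labels)
    gains = {}
    for name, _ in pipeline:
        fixed = score([run(x, i, fix_at=name) for i, x in enumerate(inputs)],
                      labels)
        gains[name] = fixed - baseline
    return sorted(gains.items(), key=lambda kv: -kv[1]), baseline
```

The ranking tells the engineer which component fix buys the most system-level improvement before any code is actually changed.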

Collaborators: Ece Kamar (Microsoft Research), Lydia Manikonda (Arizona State University), Eric Horvitz (Microsoft Research), Donald Kossmann

Quality assurance is one of the most important challenges in crowdsourcing. Assigning tasks to several workers to increase quality through redundant answers can be expensive when all answers come from homogeneous sources. In this project, we look at crowd access optimization techniques that can be applied either while building training models with crowdsourced data or while applying such models to make crowdsourced predictions.

In the context of crowdsourced predictions, our work argues that optimization needs to be aware of the diversity and correlation of information within groups of individuals, so that crowdsourcing redundancy can be adequately planned beforehand. Based on this intuitive idea, we introduce the Access Path Model (APM), a novel crowd model that leverages the notion of access paths as alternative ways of retrieving information. The access path configuration can be based on various criteria depending on the task: (i) workers' demographics (e.g., profession, group of interest, age); (ii) the source of information or the tool used to find the answer (e.g., phone call vs. web page, Bing vs. Google); (iii) task design (e.g., time of completion, user interface); (iv) task decomposition (e.g., parts of the answer, features). APM aggregates answers to ensure high quality and meaningful confidence estimates. Moreover, we devise a greedy optimization algorithm for this model that finds a provably good approximate plan for accessing the crowd.

The Access Path Model applied on Medical Questions and Answers
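A simplified sketch of such a greedy plan, under illustrative assumptions that are mine rather than the paper's: answers within one access path share a correlation `rho` that puts a floor on how much redundancy helps, while paths are independent. Because within-path gains diminish, the greedy allocation naturally diversifies across paths instead of buying only the single cheapest one.

```python
def path_precision(var, rho, n):
    """Precision (inverse variance) of averaging n answers from a path
    whose single-answer variance is var and within-path correlation is
    rho: the correlated part rho*var never averages away."""
    if n == 0:
        return 0.0
    return 1.0 / (rho * var + (1.0 - rho) * var / n)

def greedy_crowd_plan(paths, budget):
    """Greedily build a crowd access plan under a budget.

    paths: name -> (cost_per_query, variance, rho).
    At each step, buy the single query with the best marginal precision
    gain per unit cost; stop when nothing affordable improves the plan.
    """
    counts = {name: 0 for name in paths}
    spent = 0.0
    while True:
        best, best_gain = None, 0.0
        for name, (cost, var, rho) in paths.items():
            if spent + cost > budget:
                continue
            gain = (path_precision(var, rho, counts[name] + 1)
                    - path_precision(var, rho, counts[name])) / cost
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break
        counts[best] += 1
        spent += paths[best][0]
    return counts, spent
```

With an expensive low-variance path and a cheap noisier one, the plan mixes both, which is the diversity effect the project argues for.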

In addition, we have devised the B-LEAFS algorithm for building machine learning models with crowdsourced input features under budget constraints. The main challenge addressed by this algorithm is the natural exploration-exploitation trade-off in crowdsourcing between acquiring redundant answers for noisy features and observing more examples.
Collaborators: Anja Gruenheid, Adish Singla, Erfan Zamanian, Andreas Krause, Donald Kossmann

The goal of this project is to develop a set of novel techniques for integrating human input into a database system, in order to process some of the impossible queries that systems like Google and Oracle cannot answer today and to address some notoriously hard database research problems in a very different way than has been done in the past. Specifically, CrowdDB extends a relational database system so that it can process both conventional and crowdsourced data. For this purpose, we have designed and implemented various algorithms for quality management and query processing. Moreover, the project has focused on implementing crowdsourced query operators for entity resolution, joins, comparisons, and sorting.

CrowdDB Architecture
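To illustrate what a crowdsourced operator looks like (in the spirit of CrowdDB's CROWDEQUAL comparison, though not its actual implementation), the sketch below resolves an equality check by majority vote over redundant worker answers; `ask_worker` is a hypothetical stand-in for the real crowd platform call.

```python
from collections import Counter

def crowd_equal(a, b, ask_worker, redundancy=3):
    """Crowdsourced equality operator sketch.

    Posts the comparison to `redundancy` workers and resolves
    disagreement by majority vote, returning the winning answer together
    with the fraction of workers who agreed with it.
    """
    question = f"Is '{a}' the same entity as '{b}'?"
    votes = Counter(ask_worker(question) for _ in range(redundancy))
    answer, count = votes.most_common(1)[0]
    return answer, count / redundancy
```

A query processor could invoke such an operator for entity resolution, e.g. to decide whether "IBM" and "International Business Machines" refer to the same company, paying for more redundancy when higher confidence is needed.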

Collaborators: Anja Gruenheid, Donald Kossmann, Lynn Aders, Erfan Zamanian


Software Engineering for Machine Learning: A Case Study. Saleema Amershi, Andrew Begel, Christian Bird, Rob DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, Thomas Zimmermann; ICSE 2019. (to appear)

Guidelines for Human-AI Interaction. Saleema Amershi, Dan Weld, Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi, Penny Collisson, Jina Suh, Shamsi Iqbal, Paul Bennett, Kori Inkpen, Jaime Teevan, Ruth Kikin-Gil, Eric Horvitz; CHI 2019. pdf

Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff. Gagan Bansal, Besmira Nushi, Ece Kamar, Daniel S Weld, Walter S Lasecki, Eric Horvitz; AAAI 2019. pdf

Overcoming Blind Spots in the Real World: Leveraging Complementary Abilities for Joint Execution. Ramya Ramakrishnan, Ece Kamar, Besmira Nushi, Debadeepta Dey, Julie Shah, Eric Horvitz; AAAI 2019. pdf

Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure. Besmira Nushi, Ece Kamar, Eric Horvitz; HCOMP 2018. pdf

Analysis of Strategy and Spread of Russia-sponsored Content in the US in 2017. Alexander Spangher, Gireeja Ranade, Besmira Nushi, Adam Fourney, Eric Horvitz; arXiv 2018. pdf

On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems. Besmira Nushi, Ece Kamar, Eric Horvitz, Donald Kossmann; AAAI 2017. pdf

Quality Control and Optimization for Hybrid Crowd-Machine Learning Systems. Besmira Nushi; ETH PhD Thesis 2016. pdf

Learning and Feature Selection under Budget Constraints in Crowdsourcing. Besmira Nushi, Adish Singla, Andreas Krause, Donald Kossmann; HCOMP 2016. pdf

Fault-Tolerant Entity Resolution with the Crowd. Anja Gruenheid, Besmira Nushi, Tim Kraska, Wolfgang Gatterbauer, Donald Kossmann; arXiv 2016. full technical report

Crowd Access Path Optimization: Diversity Matters. Besmira Nushi, Adish Singla, Anja Gruenheid, Erfan Zamanian, Andreas Krause, Donald Kossmann; HCOMP 2015. pdf

CrowdSTAR: A Social Task Routing Framework for Online Communities. Besmira Nushi, Omar Alonso, Martin Hentschel, and Vasileios Kandylas; ICWE 2015. pdf full technical report

When is A = B? Anja Gruenheid, Donald Kossmann, Besmira Nushi, Yuri Gurevich; EATCS Bulletin 111 (2013). pdf

Uncertain time-series similarity: Return to the basics. Michele Dallachiesa, Besmira Nushi, Katsiaryna Mirylenka, and Themis Palpanas; Proceedings of the VLDB Endowment 5, no. 11 (2012): 1662-1673. pdf

Similarity matching for uncertain time series: analytical and experimental comparison. Michele Dallachiesa, Besmira Nushi, Katsiaryna Mirylenka, and Themis Palpanas. Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Querying and Mining Uncertain Spatio-Temporal Data, pp. 8-15. ACM, 2011. pdf

Computational Intelligence Lab, Spring 2016, ETH Zurich
Machine Learning, Autumn 2015, ETH Zurich 
Computational Intelligence Lab, Spring 2015, ETH Zurich
Big Data, Fall 2014, ETH Zurich
Data Modelling and Databases, Spring 2014, ETH Zurich
Data Modelling and Databases, Spring 2013, ETH Zurich
Big Data, Fall 2012, ETH Zurich
Software Engineering, UML, C++ Programming, University of Tirana

Microsoft Research AI (MSR AI) is a new organization that brings together the breadth of talent across Microsoft Research to pursue game-changing advances in artificial intelligence. The new research and development initiative combines advances in machine learning with innovations in language and dialog, human-computer interaction, and computer vision to solve some of the toughest challenges in AI. A key focus for this initiative is to probe the foundational principles of intelligence, including efforts to unravel the mysteries of human intellect, and to use this knowledge to develop a more general, flexible artificial intelligence. MSR AI pursues the use of machine intelligence in new ways to empower people and organizations, including systems that deliver new experiences and capabilities that help people be more efficient, engaged, and productive.

On machine learning and artificial intelligence
Talking Machines Podcast Human conversation about machine learning
The master algorithm by Pedro Domingos: Talk | Book
The Long-Term Future of (Artificial) Intelligence by Stuart Russell: Talk
Artificial Intelligence: A Modern Approach: Book

On collective intelligence and crowdsourcing
Handbook of Human Computation: Book
HCOMP - AAAI Conference on Human Computation and Crowdsourcing
The Wisdom of Crowds by James Surowiecki: TED talk | Book
The Global Brain by Peter Russell: Book


Microsoft building 99 (3117)
14820 NE 36th St, Redmond, WA 98052, USA