I am a Researcher in the Adaptive Systems and Interaction group at MSR AI. My research lies at the intersection of human and machine intelligence. I am interested in developing models and tools that enable efficient human-machine collaboration, both to improve current intelligent systems and to enhance human capabilities in solving complex tasks. My current projects focus on troubleshooting complex machine learning systems, with the aim of helping system designers better understand and debug their systems. In the same vein, I am also involved in various research initiatives and projects that study the societal impact of artificial intelligence as well as various quality-of-service aspects of AI, including interpretability, transparency, accountability, and fairness.

Prior to joining MSR AI, in 2016 I completed my PhD degree at ETH Zurich (Switzerland) in the Systems Group, supervised by Prof. Donald Kossmann and Prof. Andreas Krause. My doctoral thesis focuses on integrating crowdsourcing in the process of building machine learning algorithms and systems. It studies how human supervision can be efficiently leveraged for generating training label data for new machine learning models and algorithms. At the same time, it explores the impact of human intervention for assisting machine learning experts in troubleshooting and improving existing integrative systems composed of multiple machine learning components.

In 2011, I completed my master's studies in computer science in a double-degree MSc program at RWTH University of Aachen (Germany) and University of Trento (Italy) as an Erasmus Mundus scholar. I also hold a Diploma in Informatics from University of Tirana (Albania), where I graduated in 2007.

Intelligent systems are becoming ubiquitous not only in industry but also in people's everyday lives. Such systems can already handle a large spectrum of challenges and assist humans in task automation, planning, and decision-making. However, current applications lack the ability to understand, diagnose, and fix their own mistakes, which reduces users' trust. The ambitious goal of this project is to improve the credibility of intelligent systems by bringing human intelligence into the loop.

Human-in-the-loop troubleshooting
Troubleshooting Integrative AI systems with humans in the loop

Collaborators: Ece Kamar (Microsoft Research), Lydia Manikonda (Arizona State University), Eric Horvitz (Microsoft Research), Donald Kossmann

Quality assurance is one of the most important challenges in crowdsourcing. Assigning a task to several workers to increase quality through redundant answers can be expensive, especially when the workers are homogeneous sources whose answers add little new information. In this project, we look at various crowd access optimization techniques that can be applied either while building training models with crowdsourced data or while applying such models to make crowdsourced predictions.

In the context of crowdsourced predictions, our work argues that optimization needs to be aware of the diversity and correlation of information within groups of individuals so that crowdsourcing redundancy can be adequately planned beforehand. Based on this intuition, we introduce the Access Path Model (APM), a novel crowd model that leverages the notion of access paths as alternative ways of retrieving information. The access path configuration can be based on various criteria depending on the task: (i) workers’ demographics (e.g., profession, group of interest, age); (ii) the source of information or the tool that is used to find the answer (e.g., phone call vs. web page, Bing vs. Google); (iii) task design (e.g., time of completion, user interface); and (iv) task decomposition (e.g., part of the answers, features). APM aggregates answers while ensuring high quality and meaningful confidence. Moreover, we devise a greedy optimization algorithm for this model that finds a provably good approximate plan to access the crowd.
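The greedy planning idea can be illustrated with a minimal sketch. It is not the published APM algorithm: the utility function below is a hypothetical diminishing-returns stand-in for the model's actual objective, and the path names, costs, and `informativeness` values are invented for illustration. The sketch only shows the general pattern of repeatedly buying the answer with the largest marginal gain per unit cost until the budget runs out.

```python
def path_utility(informativeness, n):
    """Hypothetical utility of n answers from one access path.

    Assumption: each extra answer from the same path adds less than the
    previous one (diminishing returns), which mimics correlated sources.
    """
    return 1.0 - (1.0 - informativeness) ** n


def greedy_plan(paths, budget):
    """Greedily decide how many answers to request from each access path.

    paths:  dict mapping path name -> (cost_per_answer, informativeness)
    budget: total amount that may be spent
    Returns a dict mapping path name -> number of answers to request.
    """
    plan = {name: 0 for name in paths}
    remaining = budget
    while True:
        best_name, best_ratio = None, 0.0
        for name, (cost, info) in paths.items():
            if cost > remaining:
                continue  # this path is no longer affordable
            gain = path_utility(info, plan[name] + 1) - path_utility(info, plan[name])
            ratio = gain / cost  # marginal gain per unit cost
            if ratio > best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            return plan  # nothing affordable or nothing left to gain
        plan[best_name] += 1
        remaining -= paths[best_name][0]


# Illustrative, made-up configuration: an expensive but informative path
# and a cheap but noisier one.
example_paths = {"doctors": (3, 0.6), "forum": (1, 0.3)}
print(greedy_plan(example_paths, budget=6))
```

Because repeated answers from the same path are worth less each time, the plan spreads the budget across paths instead of exhausting the cheapest one, which is the intuition behind planning for diversity.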

Access Path Model
The Access Path Model applied to Medical Questions and Answers

In addition, we have devised various techniques for building machine learning models with crowdsourced feature input under budget constraints. The main challenge that we are trying to solve is related to the natural exploration and exploitation trade-offs in crowdsourcing between noisy redundancy and the number of observed examples.
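To make the trade-off concrete, here is a small worked sketch under simplifying assumptions that are not taken from the project itself: binary answers, independent workers who are each correct with the same probability `p`, a unit cost per answer, and majority voting as the aggregation rule. With a fixed budget, higher redundancy yields cleaner answers but fewer observed examples.

```python
from math import comb


def majority_vote_accuracy(p, r):
    """Probability that the majority of r independent answers is correct.

    p: probability that a single worker answers correctly
    r: number of workers asked per example (must be odd to avoid ties)
    """
    if r % 2 == 0:
        raise ValueError("r must be odd so that a majority always exists")
    return sum(
        comb(r, k) * p**k * (1 - p) ** (r - k)
        for k in range(r // 2 + 1, r + 1)
    )


def redundancy_tradeoff(budget, p, redundancies=(1, 3, 5)):
    """For each redundancy level, report (r, examples affordable, accuracy)."""
    return [(r, budget // r, majority_vote_accuracy(p, r)) for r in redundancies]


# Illustrative numbers only: a budget of 30 answers and 70%-accurate workers.
for r, n_examples, acc in redundancy_tradeoff(budget=30, p=0.7):
    print(f"redundancy={r}  examples={n_examples}  accuracy={acc:.3f}")
```

Which point on this curve is best depends on the learning task, which is why the allocation has to be decided jointly with the model being trained rather than fixed in advance.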
Collaborators: Anja Gruenheid, Adish Singla, Erfan Zamanian, Andreas Krause, Donald Kossmann

The goal of this project is to develop a set of novel techniques for integrating human resources into a database system in order to process some of the impossible queries that Google and Oracle cannot answer today, and to address some notoriously hard database research problems in a very different way than has been done in the past. Specifically, CrowdDB extends a relational database system and is able to process both conventional and crowdsourced data. For this purpose, we have designed and implemented various algorithms for quality management and query processing. Moreover, the project has focused on implementing crowdsourced query operators for entity resolution, joins, comparisons, and sorting.

CrowdDB Architecture

Collaborators: Anja Gruenheid, Donald Kossmann, Lynn Aders, Erfan Zamanian

The online communities available on the Web have been shown to be highly interactive and capable of collectively solving difficult tasks. CrowdSTAR is a framework designed to route tasks across and within online crowds. The system indexes the topic-specific expertise and social features of crowd contributors and then uses a routing algorithm that suggests the best sources to ask based on the trade-off between knowledge and social availability. CrowdSTAR is integrated with two popular social networks as crowd candidates: Twitter and Quora.
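The routing idea can be sketched as a ranking over candidates by a utility score. This is only an illustration under stated assumptions: the linear combination, the `alpha` weight, the `Candidate` type, and the [0, 1] score scales are all hypothetical and are not taken from the CrowdSTAR implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    expertise: float     # topic-specific expertise, assumed to lie in [0, 1]
    availability: float  # social availability, assumed to lie in [0, 1]


def utility(candidate, alpha=0.5):
    """Hypothetical utility: weighted mix of expertise and availability.

    alpha = 1.0 ranks purely by expertise, alpha = 0.0 purely by availability.
    """
    return alpha * candidate.expertise + (1.0 - alpha) * candidate.availability


def route(candidates, k=2, alpha=0.5):
    """Return the k candidates with the highest utility, best first."""
    ranked = sorted(candidates, key=lambda c: utility(c, alpha), reverse=True)
    return ranked[:k]


# Made-up candidates: an expert who rarely answers, a balanced user,
# and a very responsive non-expert.
crowd = [
    Candidate("expert", expertise=0.9, availability=0.1),
    Candidate("balanced", expertise=0.5, availability=0.8),
    Candidate("responsive", expertise=0.2, availability=0.9),
]
print([c.name for c in route(crowd, k=2, alpha=0.5)])
```

Changing `alpha` shifts the ranking between the most knowledgeable and the most reachable contributors, which is the trade-off the routing algorithm has to balance.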

User utility model in CrowdSTAR
The user utility model in CrowdSTAR accounts for both expertise and social availability

CrowdSTAR Architecture

Collaborators: Omar Alonso (Microsoft), Vasilis Kandylas (Microsoft), Martin Hentschel (Microsoft)


Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure. Besmira Nushi, Ece Kamar, Eric Horvitz; HCOMP 2018. pdf

On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems. Besmira Nushi, Ece Kamar, Eric Horvitz, Donald Kossmann; AAAI 2017. pdf

Quality Control and Optimization for Hybrid Crowd-Machine Learning Systems. Besmira Nushi; ETH PhD Thesis 2016. pdf

Learning and Feature Selection under Budget Constraints in Crowdsourcing. Besmira Nushi, Adish Singla, Andreas Krause, Donald Kossmann; HCOMP 2016. pdf

Fault-Tolerant Entity Resolution with the Crowd. Anja Gruenheid, Besmira Nushi, Tim Kraska, Wolfgang Gatterbauer, Donald Kossmann; arXiv 2016. full technical report

Crowd Access Path Optimization: Diversity Matters. Besmira Nushi, Adish Singla, Anja Gruenheid, Erfan Zamanian, Andreas Krause, Donald Kossmann; HCOMP 2015. pdf

CrowdSTAR: A Social Task Routing Framework for Online Communities. Besmira Nushi, Omar Alonso, Martin Hentschel, and Vasileios Kandylas; ICWE 2015. pdf full technical report

When is A = B? Anja Gruenheid, Donald Kossmann, Besmira Nushi, Yuri Gurevich; EATCS Bulletin 111 (2013) pdf

Uncertain time-series similarity: Return to the basics. Michele Dallachiesa, Besmira Nushi, Katsiaryna Mirylenka, and Themis Palpanas; Proceedings of the VLDB Endowment 5, no. 11 (2012): 1662-1673. pdf

Similarity matching for uncertain time series: analytical and experimental comparison. Michele Dallachiesa, Besmira Nushi, Katsiaryna Mirylenka, and Themis Palpanas. Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Querying and Mining Uncertain Spatio-Temporal Data, pp. 8-15. ACM, 2011. pdf

Computational Intelligence Lab, Spring 2016, ETH Zurich
Machine Learning, Autumn 2015, ETH Zurich 
Computational Intelligence Lab, Spring 2015, ETH Zurich
Big Data, Fall 2014, ETH Zurich
Data Modelling and Databases, Spring 2014, ETH Zurich
Data Modelling and Databases, Spring 2013, ETH Zurich
Big Data, Fall 2012, ETH Zurich
Software Engineering, UML, C++ Programming, University of Tirana

Microsoft Research AI (MSR AI) is a new organization that brings together the breadth of talent across Microsoft Research to pursue game-changing advances in artificial intelligence. The new research and development initiative combines advances in machine learning with innovations in language and dialog, human computer interaction, and computer vision to solve some of the toughest challenges in AI. A key focus for this initiative is to probe the foundational principles of intelligence, including efforts to unravel the mysteries of human intellect, and use this knowledge to develop a more general, flexible artificial intelligence. MSR AI pursues use of machine intelligence in new ways to empower people and organizations, including systems that deliver new experiences and capabilities that help people be more efficient, engaged and productive.

On machine learning and artificial intelligence
Talking Machines Podcast: Human conversation about machine learning
The master algorithm by Pedro Domingos: Talk | Book
The Long-Term Future of (Artificial) Intelligence by Stuart Russell: Talk
Artificial Intelligence: A Modern Approach: Book

On collective intelligence and crowdsourcing
Handbook of Human Computation: Book
HCOMP - AAAI Conference on Human Computation and Crowdsourcing
CROWDML - Workshop on Crowdsourcing and Machine Learning
Rich Data Summit - Conference on Data Science and Crowdsourcing
Wisdom of the Crowds by James Surowiecki: TED talk | Book
The Global Brain by Peter Russell: Book


Microsoft building 99 (3117)
14820 NE 36th St, Redmond, WA 98052, USA