Part 2 – Microsoft 365 Copilot under the hood

Reading Time: 7 minutes


In part one of this blog series, I provided a short introduction to Microsoft 365 Copilot. If you missed it, here is the link to that post: Save time and be more productive at work with Microsoft 365 Copilot – Part 1

In part 2 of this blog series, I explore how Microsoft 365 Copilot works under the hood.

Once you have been assigned a Microsoft 365 Copilot license, the Copilot icon will appear in the different Microsoft 365 Apps. I will demonstrate Copilot in several of these Apps in a future post.


Step 1

Andrew Doe, a manager, returns from holiday to find a lengthy email discussion, including a few attachments, about a new office location project. Upon opening the email, he asks Copilot to summarise the conversation and identify any actions assigned to him. When we as users ask Copilot to do something, such as summarise or draft an email, the request is known as a prompt. More on prompts later.

The Summarise button will summarise the conversation within the email thread.

However, if you wish to ask Copilot to check for any outstanding tasks from the last couple of weeks, there is a Copilot button that works across Outlook as a whole instead of focusing on one email thread. See image below.


Step 2

The Copilot orchestration engine receives the prompt from Andrew Doe’s Outlook application.


Step 3

The Copilot orchestration engine performs a task known as pre-processing or grounding, during which it accesses Microsoft Graph and the semantic index. Microsoft Graph gives access to your Microsoft 365 data, such as your calendar, SharePoint and OneDrive files, meetings, chats, and more. Additionally, Copilot can search other services using plugins and connectors, such as a Bing search plugin that allows access to internet content, or third-party applications such as ServiceNow. This grounding/pre-processing step enhances the quality of the prompt so that you receive relevant answers.



What is Semantic search?
The semantic index is a new feature of Microsoft 365 search that works with Microsoft Graph to better interpret your personal and organisational data. Relevant information is retrieved from Microsoft Graph and the semantic index to give the Large Language Model (LLM) more information to reason over. As an example, suppose you want Microsoft Copilot to locate an email where a colleague praised the design work of a vendor. The semantic index brings related words (for example, elated, excited, amazed) into the search to broaden the search area and give the best result. All of this work takes place behind the scenes to add relevance to the results you search for with Microsoft Copilot.

As an analogy, semantic search is like a librarian who not only knows every book in the library but also understands the story behind your question. Traditional search is like looking for books with a specific title, while semantic search finds books by understanding the story you’re really interested in, even if the title is slightly off.
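To make the difference between keyword search and related-word search concrete, here is a toy sketch in Python. It is not how the semantic index is implemented; the RELATED table, the function names and the sample emails are all made up for illustration, and a real system learns these relationships rather than using a hand-written table.

```python
# Hypothetical table of related words (an assumption for this example).
# A real semantic index learns such relationships from data.
RELATED = {"praised": {"praised", "elated", "excited", "amazed"}}


def keyword_search(query_word, emails):
    """Return only the emails that contain the exact query word."""
    return [e for e in emails if query_word in e.lower().split()]


def expanded_search(query_word, emails):
    """Return emails that contain the query word or any related word."""
    terms = RELATED.get(query_word, {query_word})
    return [e for e in emails if terms & set(e.lower().split())]


emails = [
    "I am amazed by the vendor design work",
    "Please send the invoice to the vendor",
    "The manager praised the new logo",
]

print(keyword_search("praised", emails))   # exact matches only
print(expanded_search("praised", emails))  # also matches "amazed"
```

The keyword search misses the first email because it never uses the word "praised", while the expanded search finds it through the related word "amazed".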



Step 4

The Copilot orchestration engine combines the user's prompt with the data retrieved from Microsoft Graph and the semantic index, and sends the modified prompt to the Large Language Model (LLM).
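As a rough illustration of what a "modified prompt" could look like, the Python sketch below prepends retrieved context to the user's request. Everything here is an assumption made for the example: GroundingItem, ground_prompt, the prompt layout and the sample data are hypothetical and do not reflect the actual format Copilot uses.

```python
from dataclasses import dataclass


@dataclass
class GroundingItem:
    source: str  # illustrative label such as "email" or "calendar"
    text: str    # the retrieved snippet


def ground_prompt(user_prompt, items):
    """Return the user prompt with the retrieved context placed in front of it."""
    context_lines = [f"[{item.source}] {item.text}" for item in items]
    context = "\n".join(context_lines) if context_lines else "(no context found)"
    return f"Context:\n{context}\n\nUser request:\n{user_prompt}"


# Made-up data standing in for what a Microsoft Graph lookup might return.
items = [
    GroundingItem("email", "Andrew to confirm the floor plan by Friday."),
    GroundingItem("calendar", "Office move kickoff meeting on Monday."),
]

grounded = ground_prompt("Summarise this thread and list my actions.", items)
print(grounded)
```

The point of the sketch is simply that the LLM receives more than the words the user typed: it also receives the relevant data that grounding found.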



What is a Large Language Model (LLM)?

There is a lot more to LLMs than I can cover here, so this post is in no way a deep dive. Here is a simplified overview.

Large language models (LLMs) represent a class of artificial intelligence models that specialise in understanding and generating human-like text. In the context of Microsoft 365 Copilot, LLMs are the engine that drives Copilot's capabilities. You may have heard or read about OpenAI, the company that developed the popular ChatGPT service. Microsoft has invested billions of dollars in OpenAI and the LLMs it develops. Microsoft uses the models behind ChatGPT but hosts them privately on Microsoft’s Azure OpenAI Service, so your company data is not shared with OpenAI. Microsoft aims to push the boundaries of AI research and development, and by partnering with OpenAI it can leverage cutting-edge AI technologies and innovations. This collaboration is seen as a way to accelerate AI breakthroughs and ensure the benefits are broadly shared with the world.

A few points about LLMs:

1. LLMs are used to understand user inputs and generate relevant responses.

2. LLMs allow computers to understand and generate language.

3. LLMs specialise in understanding and generating human-like text.

4. LLMs operate as generative AI, producing new content, and can hold a conversation that mimics human behaviour. It can be difficult to tell whether you’re having a conversation with a human or a machine.

5. LLMs provide the engine that drives Copilot capabilities. The LLM is what responds to the prompts/instructions we send it.

6. Instead of merely predicting or classifying, generative AI such as an LLM can produce entirely new content.

LLMs are trained using a large amount of data sourced from the Internet, books, conversations, movies and a lot more. An LLM can be used for all sorts of tasks including chat, translation, summarisation, brainstorming, writing poems, code generation, writing a book, troubleshooting, writing a FAQ, image creation/detection and a lot more.

That’s where the name Large Language Model comes from. A large amount of work goes into training an LLM. In simple terms, an LLM is a super-intelligent autocomplete: if we input “Roses are _______”, the LLM will respond with the next word, “Red”. During training, the LLM makes errors and is corrected. For example, if an LLM responded with “Roses are Green”, the team training the model would correct it, and this process continues as the LLM is fine-tuned and gets better.
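The autocomplete idea can be shown with a very small Python sketch. This is a deliberately simple word-pair counter, not how a real LLM works internally (real models use neural networks trained on vastly more text); the function names and the three-sentence corpus are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


corpus = ["roses are red", "violets are blue", "roses are red and sweet"]
model = train_bigrams(corpus)

print(predict_next(model, "are"))  # "red" follows "are" most often in this corpus
```

Because "red" follows "are" twice and "blue" only once in the toy corpus, the model completes "Roses are" with "red". Feeding it different training text would change its answers, which is the same basic reason training data matters so much for an LLM.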

We can compare an LLM to how neurons (brain cells) work in the human brain. The human brain has some 80 – 100 billion neurons with around 100 trillion connections between them. The brain is structured so that each neuron is connected to thousands of other cells. Together, brain cells form a very complex and highly interconnected network, sending electrical signals to each other to allow us to process information.


Let’s take the example of a toddler who is shown a picture of a dog. At first the toddler will make mistakes when learning to tell animals apart. When the toddler incorrectly identifies a dog as a cat, a parent or teacher may correct them. The more practice the toddler gets over time, by viewing pictures of different animals or seeing animals in the real world, the more the neurons in the brain adjust, allowing the toddler to get better at identifying animals correctly.

Data scientists designed LLMs in a similar way to how the brain works. An LLM is built on a neural network in which each neuron is connected to others. As mentioned earlier, the LLM is pre-trained on a large amount of data. For example, a model can be provided with thousands or even millions of pictures of dogs and then trained to identify them correctly. When the model makes a mistake in identifying an animal, it is corrected and the neural network starts to adjust, and this process continues as the model learns, similar to the way we learn as humans.
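The "make a mistake, get corrected, adjust" loop can be sketched with a single artificial neuron in Python. This uses the classic perceptron learning rule on made-up data; it is a teaching toy and not the training method used for real LLMs, which are far larger and trained with different techniques. The feature values and names are assumptions for the example.

```python
# Each example is ((barks, meows), label) with label 1 = dog and 0 = cat.
# The numbers are invented purely for illustration.
DATA = [((1.0, 0.0), 1), ((0.9, 0.2), 1), ((0.1, 1.0), 0), ((0.0, 0.8), 0)]


def predict(weights, bias, features):
    """Return 1 (dog) if the weighted sum is positive, otherwise 0 (cat)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(data, learning_rate=0.1, max_epochs=10):
    """Adjust the weights after every mistake; stop once a full pass has no mistakes."""
    weights, bias = [0.0, 0.0], 0.0
    mistakes_per_epoch = []
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                # The correction step: nudge each weight towards the right answer.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        mistakes_per_epoch.append(mistakes)
        if mistakes == 0:
            break
    return weights, bias, mistakes_per_epoch


weights, bias, mistakes_per_epoch = train(DATA)
print(mistakes_per_epoch)  # number of mistakes made in each pass over the data
```

The neuron starts out knowing nothing, gets some examples wrong, and is nudged after each mistake until it classifies every example correctly, much like the toddler being corrected.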

In the diagram below, each circle represents a neuron. When we provide an input, we expect to receive an output. Under the hood we have the hidden layer, where all the processing takes place before we are provided with the result, known as the output. Simply put, as the network makes mistakes and learns, neurons are activated/deactivated.
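For readers who like to see the idea in code, here is a minimal Python sketch of an input passing through one hidden layer to produce an output. The layer sizes and all weight values are arbitrary numbers chosen for the example; a real LLM has billions of such values, which are learned during training rather than typed in by hand.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0 to 1 (how strongly a neuron is 'activated')."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Compute one layer: every neuron takes a weighted sum of all the inputs."""
    return [
        sigmoid(sum(w * i for w, i in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Pass the input through the hidden layer and then through the output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)


# Arbitrary example values: 2 inputs, 3 hidden neurons, 1 output neuron.
hidden_w = [[0.5, -0.2], [0.3, 0.8], [-0.6, 0.1]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[0.7, -0.4, 0.2]]
out_b = [0.05]

output = forward([1.0, 0.0], hidden_w, hidden_b, out_w, out_b)
print(output)  # a single number between 0 and 1
```

Training, in this picture, means adjusting the numbers in hidden_w, hidden_b, out_w and out_b until the output is what we want for each input.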


Scientists discovered that the neural network within a large language model (LLM) can be structured to allow neurons to loop back into previous layers, enabling two-way communication similar to human neurons. This breakthrough led to more complex behaviours in LLMs, culminating in the development of ChatGPT by OpenAI, which can engage in human-like conversations. Microsoft invested in OpenAI and uses its LLMs in Microsoft products. As you’ll appreciate, there is a lot more to this topic and the information I have provided is basic, but I hope it gives you a simple overview.


Step 5

The LLM (Large Language Model) receives the prompt from the Copilot orchestration engine and generates a response. It then returns the response to the Copilot orchestration engine.


Step 6

Copilot takes the response from the LLM and post-processes it. The post-processing involves additional grounding calls to Microsoft Graph, along with security, compliance, privacy and responsible AI checks. This is a final check to make sure it is safe to forward the generated response from the LLM to the user, Andrew.
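Putting the steps together, the whole flow can be sketched as a short Python pipeline. Every function below is a hypothetical stand-in written for this post: the retrieval step returns canned data, fake_llm returns a fixed sentence, and the single blocked-term check is only a placeholder for the much richer security, compliance, privacy and responsible AI checks described above.

```python
def retrieve_context(prompt):
    """Step 3 stand-in: pretend to look up relevant data for the prompt."""
    return ["Andrew to confirm the floor plan by Friday."]


def build_grounded_prompt(prompt, context):
    """Step 4 stand-in: combine the retrieved data with the user's prompt."""
    return "Context:\n" + "\n".join(context) + "\n\nRequest:\n" + prompt


def fake_llm(grounded_prompt):
    """Step 5 stand-in: a canned response instead of a real language model."""
    return "Summary: the office move is on track. Action for Andrew: confirm the floor plan by Friday."


# Placeholder for the real post-processing checks (an assumption for this example).
BLOCKED_TERMS = {"password"}


def post_process(response):
    """Step 6 stand-in: hold back a response that fails the check."""
    if any(term in response.lower() for term in BLOCKED_TERMS):
        return "Sorry, I can't share that."
    return response


def handle_prompt(prompt, llm=fake_llm):
    """Steps 2 to 6: receive the prompt, ground it, call the LLM, check the response."""
    context = retrieve_context(prompt)
    grounded = build_grounded_prompt(prompt, context)
    raw_response = llm(grounded)
    return post_process(raw_response)


print(handle_prompt("Summarise this thread and list my actions."))
```

The key design point is that the user never talks to the LLM directly: the orchestration step sits on both sides of it, adding context on the way in and checking the response on the way out.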


Final Diagram


Stay tuned for the next post, where I will explore Copilot in several Microsoft 365 Apps.