Skip to content

Conversation

@eavanvalkenburg
Copy link
Member

Motivation and Context

ADR proposing changes to the thread setup in Python.

Description

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings November 24, 2025 12:52
@markwallace-microsoft markwallace-microsoft added the documentation Improvements or additions to documentation label Nov 24, 2025
Copilot finished reviewing on behalf of eavanvalkenburg November 24, 2025 12:55
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces an Architecture Decision Record (ADR) proposing changes to the thread management system in the Python agent framework. The ADR addresses confusion around the current AgentThread class, which can exist in multiple states and hold different types of data, making it difficult for users to understand and use correctly.

Key Changes

  • Proposes two options for restructuring thread management: maintaining the current single-class approach vs. separating into distinct RemoteThread and LocalThread classes
  • Documents the current implementation's limitations and the decision drivers focused on ease of use and clarity
  • Presents trade-offs between consistency with .NET implementation and Python-specific design patterns

- We would then add a flag on ChatClients, to indicate which type of thread they support, and it can be both, so two flags are likely needed, although local thread might always be possible.
- And finally, all Agents would get two methods, `get_service_thread()`/`get_remote_thread` and `get_local_thread()`, both of which might raise an error if the chat client does not support that type of thread.
- the `run` methods would take both types of threads, but would raise an error if the thread type is not supported by the chat client. And it would also check with the `store` parameter to make sure it is used correctly, or set it correctly.
- One open question is how to handle when there is a mismatch between the thread type and the `store` parameter, for example passing a `LocalAgentThread` with `store=True`, or a `ServiceAgentThread` with `store=False`. Options are to either raise an error, or to ignore the `store` parameter and always do the right thing based on the thread type. Or to transform the thread into the right type, but that seems more complex and might not always be possible. Although starting with a local thread (which would be a list of chat messages in a ChatMessageStore) and then setting store=True might make sense, the return would be a service thread then, but that adds complexity, this might be useful for workflows that combine different agent types.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the fact that it's explicit, but this also makes it more complicated with store parameter, because at the moment you only need to change this parameter value True/False, while with a separate thread type you would also need to change that thread type, so basically configuring the same logic in two different places.

If we introduce a separate thread type for remote and local messages, maybe we can remove store parameter in this case. Instead of using store parameter, we will just check a thread type, and by default it will be a local one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, we do need to look at both conversation_id and store parameters in this case. I'll add

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only challenge is that chat options work at the chat client level, while this works at the agent level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more clarity on this choice, and the option I think makes most sense.

- Good, because it is clear which chat clients support which type of thread.
- Good, because we can make all the logic that deals with threads much clearer, as each class has a single responsibility.
- Good, because it might also enable a abstracted method to get a list of chat messages from a thread through the chat client.
- Bad, because it would more fundamentally diverge from dotnet.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if this API brings more clarity, it can be introduced in .NET as well. Otherwise, I'm not sure if we want to diverge from .NET this much.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When discussing with Weslie we also came to the conclusion that we might invert the order, and do this for python and follow with .net

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main problem in .net is that we don't know what kind of thread is supported by the underlying service until we get the first response from the chat client and changing this will be challenging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main problem in .net is that we don't know what kind of thread is supported by the underlying service until we get the first response from the chat client and changing this will be challenging.

@eavanvalkenburg Isn't it the same in Python?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but since currently all the clients are built by us and part of AF, it's a onetime breaking change to add the flag to the protocol and base_client and then we are good to go, and if someone implements their own, it's a simple addition. For dotnet, this is a bit more tricky

- Bad, because it is unclear which chat clients can support which type of thread.
- Bad, because dotnet also has subclasses for each type of agent, so already somewhat diverging from dotnet.

### 2. Separate classes for `ServiceThread`/`RemoteThread` and `LocalAgentThread`/`LocalThread`, each with their own behaviors and methods.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the explicit-ness of the separate classes. I think doing it this way will be better for developers, too, as they'd know which type they're working with and what's available in each object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants