Commit 578e172

feat: reduce token usage by enable YOLO profile merge (#120)
* feat: add already profile in summary
* temp
* feat: add merge yolo to reduce tokens
* tests: fix merge yolo test
* fix: yolo merge parsing
* docs: update
* docs: update
1 parent b9ce6ce commit 578e172

21 files changed, +764 -278 lines changed

Changelog.md

Lines changed: 11 additions & 1 deletion
@@ -1,4 +1,14 @@
-### [0.0.39] - unreleased
+### [0.0.40] - unreleased
+
+Added:
+
+- Use YOLO profile merge instead of multiple profile merges, reduce tokens cost ~30%
+
+Fixed:
+
+- Randomly Chinese Profile problem
+
+### [0.0.39] - 2025/8/9
 
 **Added**

readme.md

Lines changed: 9 additions & 3 deletions
@@ -50,14 +50,20 @@
 
 
 
-Memobase is a **user profile-based memory system** designed to bring long-term user memory to your Generative AI (GenAI) applications. Whether you're building virtual companions, educational tools, or personalized assistants, Memobase empowers your AI to **remember**, **understand**, and **evolve** with your users.
+Memobase is a **user profile-based memory system** designed to bring long-term user memory to your LLM applications. Whether you're building virtual companions, educational tools, or personalized assistants, Memobase empowers your AI to **remember**, **understand**, and **evolve** with your users.
 
 
 
-Memobase can provide you structured profiles of users, check out the [result](./docs/experiments/900-chats/readme.md) (compared with [mem0](https://github.com/mem0ai/mem0)) from a 900-turns real-world chatting:
+Memobase offers the perfect balance for your product among various memory solutions. At Memobase, we focus on three key metrics simultaneously:
 
+- **Performance**: Although Memobase is not specifically designed for RAG/search tasks, it still achieves top-tier search performance in the LOCOMO benchmark.
+- **LLM Cost**: Memobase includes a built-in buffer for each user to batch-process their chats, allowing the overhead to be distributed efficiently. Additionally, we carefully design our prompts and workflows, ensuring there are no "agents" in the system that could lead to excessive costs.
+- **Latency**: Memobase works similarly to the memory system behind ChatGPT: for each user, there is always a user profile and event timeline available. This allows you to access the most important memories of a user without any pre-processing, but only few SQL operations, keeping online latency under 100ms.
 
 
+
+Check out the profile [result](./docs/experiments/900-chats/readme.md) (compared with [mem0](https://github.com/mem0ai/mem0)) from a 900-turns real-world chatting:
+
 <details>
 <summary>Partial Profile Output</summary>
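The **LLM Cost** bullet in this README hunk describes a per-user buffer that batches chats before they are processed. The sketch below shows one way such a buffer could work. The `ChatBuffer` class, its `max_tokens` threshold, and the whitespace word count standing in for a tokenizer are all assumptions made for illustration and are not Memobase's actual API.

```python
from collections import defaultdict


class ChatBuffer:
    """Hypothetical per-user buffer: flush once enough tokens accumulate."""

    def __init__(self, max_tokens, flush_fn):
        self.max_tokens = max_tokens
        self.flush_fn = flush_fn  # called with (user_id, list_of_messages)
        self._pending = defaultdict(list)
        self._sizes = defaultdict(int)

    def add(self, user_id, message):
        """Queue a message; return True if this call triggered a flush."""
        self._pending[user_id].append(message)
        # A whitespace word count is a crude stand-in for a real tokenizer.
        self._sizes[user_id] += len(message.split())
        if self._sizes[user_id] >= self.max_tokens:
            self.flush(user_id)
            return True
        return False

    def flush(self, user_id):
        """Hand all pending messages for one user to flush_fn in a single batch."""
        messages = self._pending.pop(user_id, [])
        self._sizes.pop(user_id, None)
        if messages:
            self.flush_fn(user_id, messages)


flushed = []
buf = ChatBuffer(max_tokens=6, flush_fn=lambda uid, msgs: flushed.append((uid, msgs)))
results = [
    buf.add("u1", "hello there"),
    buf.add("u1", "I like tea"),
    buf.add("u1", "and long walks"),
]
```

Here the expensive step (`flush_fn`) runs once for three messages, so its fixed overhead is shared across the batch.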

@@ -90,7 +96,7 @@ Memobase can provide you structured profiles of users, check out the [result](./
 </details>
 
 ## 🎉 Recent Updates
-- `0.0.38`: we updated the workflows in Memobase, reducing the insert cost by 30%
+- `0.0.40`: we updated the internal workflows in Memobase, reducing the number of LLM calls in a single run from approximately 3-10 times to a fixed 3 times, which reduces token costs by approximately 40-50%. (Consider updating your Memobase version!)
 - `0.0.37`: we added fine-grained event gist, enabling the detailed search on users' timeline. [Re-ran the LOCOMO benchmark](./docs/experiments/locomo-benchmark) and we're SOTA!
 - `0.0.36`: we updated the search of `context` api, making the search take between 500~1000ms (depending on the embedding API you're using). Also, you can [pass a prompt template](https://docs.memobase.io/api-reference/prompt/get_context#parameter-customize-context-prompt) to the `context` api to pack memories directly into prompt.
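The `0.0.40` entry says the number of LLM calls per run drops from a variable count to a fixed one. The sketch below illustrates why batching a merge step has that effect. `CountingLLM`, `merge_per_fact`, and `merge_batched` are hypothetical names, and the prompts are placeholders rather than the ones Memobase uses.

```python
import asyncio


class CountingLLM:
    """Fake LLM client that only counts how often it is called."""

    def __init__(self):
        self.calls = 0

    async def complete(self, prompt):
        self.calls += 1
        return "ok"


async def merge_per_fact(llm, facts):
    # One call per extracted fact: cost grows with the number of facts.
    for fact in facts:
        await llm.complete(f"Merge this fact: {fact}")


async def merge_batched(llm, facts):
    # All facts packed into a single prompt: one call regardless of count.
    numbered = "\n".join(f"{i}. {fact}" for i, fact in enumerate(facts, 1))
    await llm.complete(f"Merge all of these facts:\n{numbered}")


facts = ["likes tea", "lives in Berlin", "has a dog", "works as a nurse"]
per_fact_llm, batched_llm = CountingLLM(), CountingLLM()
asyncio.run(merge_per_fact(per_fact_llm, facts))
asyncio.run(merge_batched(batched_llm, facts))
```

The batched variant also sends any shared instructions once instead of once per fact, which is one plausible source of token savings. The actual figures are the ones stated in the changelog, not something this sketch measures.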

src/server/api/memobase_server/controllers/modal/chat/__init__.py

Lines changed: 24 additions & 6 deletions
@@ -5,11 +5,14 @@
 from ....utils import get_blob_str, get_encoded_tokens
 from ....models.blob import Blob
 from ....models.utils import Promise, CODE
-from ....models.response import IdsData, ChatModalResponse
+from ....models.response import IdsData, ChatModalResponse, UserProfilesData
 from ...profile import add_update_delete_user_profiles
 from ...event import append_user_event
+from ...profile import get_user_profiles
 from .extract import extract_topics
-from .merge import merge_or_valid_new_memos
+
+# from .merge import merge_or_valid_new_memos
+from .merge_yolo import merge_or_valid_new_memos
 from .summary import re_summary
 from .organize import organize_profiles
 from .types import MergeAddResult
@@ -47,7 +50,14 @@ async def process_blobs(
         return p
     project_profiles = p.data()
 
-    p = await entry_chat_summary(user_id, project_id, blobs, project_profiles)
+    p = await get_user_profiles(user_id, project_id)
+    if not p.ok():
+        return p
+    current_user_profiles = p.data()
+
+    p = await entry_chat_summary(
+        user_id, project_id, blobs, project_profiles, current_user_profiles
+    )
     if not p.ok():
         return p
     user_memo_str = p.data().strip()
@@ -63,8 +73,12 @@ async def process_blobs(
         )
 
     processing_results = await asyncio.gather(
-        process_profile_res(user_id, project_id, user_memo_str, project_profiles),
-        process_event_res(user_id, project_id, user_memo_str, project_profiles),
+        process_profile_res(
+            user_id, project_id, user_memo_str, project_profiles, current_user_profiles
+        ),
+        process_event_res(
+            user_id, project_id, user_memo_str, project_profiles, current_user_profiles
+        ),
     )
 
     profile_results: Promise = processing_results[0]
@@ -109,9 +123,12 @@ async def process_profile_res(
     project_id: str,
     user_memo_str: str,
     project_profiles: ProfileConfig,
+    current_user_profiles: UserProfilesData,
 ) -> Promise[tuple[MergeAddResult, list[dict]]]:
 
-    p = await extract_topics(user_id, project_id, user_memo_str, project_profiles)
+    p = await extract_topics(
+        user_id, project_id, user_memo_str, project_profiles, current_user_profiles
+    )
     if not p.ok():
         return p
     extracted_data = p.data()
@@ -170,6 +187,7 @@ async def process_event_res(
     project_id: str,
     memo_str: str,
     config: ProfileConfig,
+    current_user_profiles: UserProfilesData,
 ) -> Promise[list | None]:
     p = await tag_event(project_id, config, memo_str)
     if not p.ok():
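The change to `process_blobs` fetches the user's profiles once and threads the same object through the summary step and both concurrent branches, returning early on any failure. A self-contained sketch of that control flow follows. `Result` is a minimal stand-in for the repository's `Promise` type, and every function here is a hypothetical placeholder, not the real Memobase code.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Result:
    """Minimal stand-in for the Promise type seen in the diff (hypothetical)."""

    value: Any = None
    error: Optional[str] = None

    def ok(self) -> bool:
        return self.error is None

    def data(self) -> Any:
        return self.value


fetch_log = []  # records every profile fetch so the count can be checked


async def get_profiles(user_id: str) -> Result:
    fetch_log.append(user_id)
    return Result(value=["interest::tea"])


async def summarize(chat: str, profiles: list) -> Result:
    return Result(value=f"{chat} | known profiles: {len(profiles)}")


async def process_profiles(memo: str, profiles: list) -> Result:
    return Result(value=("profiles", memo, len(profiles)))


async def process_events(memo: str, profiles: list) -> Result:
    return Result(value=("events", memo, len(profiles)))


async def process(user_id: str, chat: str) -> Result:
    p = await get_profiles(user_id)  # fetched once, up front
    if not p.ok():
        return p
    profiles = p.data()

    p = await summarize(chat, profiles)
    if not p.ok():
        return p
    memo = p.data()

    # Both branches reuse the same snapshot instead of re-fetching it.
    results = await asyncio.gather(
        process_profiles(memo, profiles),
        process_events(memo, profiles),
    )
    for r in results:
        if not r.ok():
            return r
    return Result(value=[r.data() for r in results])


outcome = asyncio.run(process("u1", "I like tea"))
```

Passing one snapshot down means every step sees the same profile state, and the storage layer is queried once per run instead of once per step.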

src/server/api/memobase_server/controllers/modal/chat/entry_summary.py

Lines changed: 19 additions & 5 deletions
@@ -8,16 +8,25 @@
 from ....prompts.profile_init_utils import read_out_event_tags
 from ....prompts.utils import tag_chat_blobs_in_order_xml
 from .types import FactResponse, PROMPTS
+from ....models.response import UserProfilesData
+from .utils import pack_current_user_profiles
 
 
 async def entry_chat_summary(
-    user_id: str, project_id: str, blobs: list[Blob], project_profiles: ProfileConfig
+    user_id: str,
+    project_id: str,
+    blobs: list[Blob],
+    project_profiles: ProfileConfig,
+    current_user_profiles: UserProfilesData,
 ) -> Promise[str]:
     assert all(b.type == BlobType.chat for b in blobs), "All blobs must be chat blobs"
-    USE_LANGUAGE = project_profiles.language or CONFIG.language
-    project_profiles_slots = read_out_profile_config(
-        project_profiles, PROMPTS[USE_LANGUAGE]["profile"].CANDIDATE_PROFILE_TOPICS
+    CURRENT_PROFILE_INFO = pack_current_user_profiles(
+        current_user_profiles, project_profiles
     )
+
+    USE_LANGUAGE = CURRENT_PROFILE_INFO["use_language"]
+    project_profiles_slots = CURRENT_PROFILE_INFO["project_profile_slots"]
+
     prompt = PROMPTS[USE_LANGUAGE]["entry_summary"]
     event_summary_theme = (
         project_profiles.event_theme_requirement or CONFIG.event_theme_requirement
@@ -33,7 +42,7 @@ async def entry_chat_summary(
     blob_strs = tag_chat_blobs_in_order_xml(blobs)
     r = await llm_complete(
         project_id,
-        prompt.pack_input(blob_strs),
+        prompt.pack_input(CURRENT_PROFILE_INFO["already_topics_prompt"], blob_strs),
         system_prompt=prompt.get_prompt(
             profile_topics_str,
             event_attriubtes_str,
@@ -43,4 +52,9 @@ async def entry_chat_summary(
         model=CONFIG.summary_llm_model,
         **prompt.get_kwargs(),
     )
+
+    # print(
+    # prompt.pack_input(CURRENT_PROFILE_INFO["already_topics_prompt"], blob_strs),
+    # r.data(),
+    # )
     return r

src/server/api/memobase_server/controllers/modal/chat/extract.py

Lines changed: 17 additions & 71 deletions
@@ -1,22 +1,15 @@
-import asyncio
 from ....env import CONFIG, ContanstTable, TRACE_LOG
-from ....utils import truncate_string
 from ....models.utils import Promise
-from ....models.blob import Blob, BlobType
-from ....models.response import AIUserProfiles, CODE
+from ....models.response import AIUserProfiles, CODE, UserProfilesData
 from ....llms import llm_complete
 from ....prompts.utils import (
-    tag_chat_blobs_in_order_xml,
     attribute_unify,
     parse_string_into_profiles,
-    parse_string_into_merge_action,
 )
 from ....prompts.profile_init_utils import read_out_profile_config, UserProfileTopic
-from ...profile import get_user_profiles
 from ...project import ProfileConfig
-
-# from ...project impor
 from .types import FactResponse, PROMPTS
+from .utils import pack_current_user_profiles
 
 
 def merge_by_topic_sub_topics(new_facts: list[FactResponse]):
@@ -31,73 +24,26 @@ def merge_by_topic_sub_topics(new_facts: list[FactResponse]):
 
 
 async def extract_topics(
-    user_id: str, project_id: str, user_memo: str, project_profiles: ProfileConfig
+    user_id: str,
+    project_id: str,
+    user_memo: str,
+    project_profiles: ProfileConfig,
+    current_user_profiles: UserProfilesData,
 ) -> Promise[dict]:
-    p = await get_user_profiles(user_id, project_id)
-    if not p.ok():
-        return p
-    profiles = p.data().profiles
-    USE_LANGUAGE = project_profiles.language or CONFIG.language
-    STRICT_MODE = (
-        project_profiles.profile_strict_mode
-        if project_profiles.profile_strict_mode is not None
-        else CONFIG.profile_strict_mode
-    )
 
-    project_profiles_slots = read_out_profile_config(
-        project_profiles, PROMPTS[USE_LANGUAGE]["profile"].CANDIDATE_PROFILE_TOPICS
+    profiles = current_user_profiles.profiles
+    CURRENT_PROFILE_INFO = pack_current_user_profiles(
+        current_user_profiles, project_profiles
     )
-    if STRICT_MODE:
-        allowed_topic_subtopics = set()
-        for p in project_profiles_slots:
-            for st in p.sub_topics:
-                allowed_topic_subtopics.add(
-                    (attribute_unify(p.topic), attribute_unify(st["name"]))
-                )
+    USE_LANGUAGE = CURRENT_PROFILE_INFO["use_language"]
+    STRICT_MODE = CURRENT_PROFILE_INFO["strict_mode"]
 
-    if len(profiles):
-        already_topics_subtopics = set(
-            [
-                (
-                    attribute_unify(p.attributes[ContanstTable.topic]),
-                    attribute_unify(p.attributes[ContanstTable.sub_topic]),
-                )
-                for p in profiles
-            ]
-        )
-        already_topic_subtopics_values = {
-            (
-                attribute_unify(p.attributes[ContanstTable.topic]),
-                attribute_unify(p.attributes[ContanstTable.sub_topic]),
-            ): p.content
-            for p in profiles
-        }
-        if STRICT_MODE:
-            already_topics_subtopics = already_topics_subtopics.intersection(
-                allowed_topic_subtopics
-            )
-            already_topic_subtopics_values = {
-                k: already_topic_subtopics_values[k] for k in already_topics_subtopics
-            }
-        already_topics_subtopics = sorted(already_topics_subtopics)
-        already_topics_prompt = "\n".join(
-            [
-                f"- {topic}{CONFIG.llm_tab_separator}{sub_topic}{CONFIG.llm_tab_separator}{truncate_string(already_topic_subtopics_values[(topic, sub_topic)], 5)}"
-                for topic, sub_topic in already_topics_subtopics
-            ]
-        )
-        TRACE_LOG.info(
-            project_id,
-            user_id,
-            f"Already have {len(profiles)} profiles, {len(already_topics_subtopics)} topics",
-        )
-    else:
-        already_topics_prompt = ""
+    project_profiles_slots = CURRENT_PROFILE_INFO["project_profile_slots"]
 
     p = await llm_complete(
         project_id,
         PROMPTS[USE_LANGUAGE]["extract"].pack_input(
-            already_topics_prompt,
+            CURRENT_PROFILE_INFO["already_topics_prompt"],
             user_memo,
             strict_mode=STRICT_MODE,
         ),
@@ -112,7 +58,7 @@ async def extract_topics(
     results = p.data()
     # print(
     # PROMPTS[USE_LANGUAGE]["extract"].pack_input(
-    # already_topics_prompt,
+    # CURRENT_PROFILE_INFO["already_topics_prompt"],
     # user_memo,
     # strict_mode=STRICT_MODE,
     # )
@@ -145,11 +91,11 @@ async def extract_topics(
     fact_attributes = []
 
     for nf in new_facts:
-        if STRICT_MODE:
+        if CURRENT_PROFILE_INFO["allowed_topic_subtopics"] is not None:
             if (
                 nf[ContanstTable.topic],
                 nf[ContanstTable.sub_topic],
-            ) not in allowed_topic_subtopics:
+            ) not in CURRENT_PROFILE_INFO["allowed_topic_subtopics"]:
                 continue
         fact_contents.append(nf["memo"])
         fact_attributes.append(
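This diff moves the inline bookkeeping out of `extract_topics` and behind `pack_current_user_profiles`, whose body is not part of the hunks shown. The sketch below guesses at what such a helper might compute, based only on the removed lines and the dictionary keys read afterwards. `pack_profiles`, `Profile`, `unify`, the `::` separator, and the character-based truncation are simplified stand-ins, not the real implementation.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    topic: str
    sub_topic: str
    content: str


def unify(name: str) -> str:
    """Stand-in for attribute_unify; the real normalization may differ."""
    return name.strip().lower().replace(" ", "_")


def pack_profiles(profiles, allowed_slots, strict_mode, language="en",
                  sep="::", max_chars=20):
    """Bundle the per-run profile context into one dict (hypothetical)."""
    allowed = None
    if strict_mode:
        # In strict mode only configured (topic, sub_topic) pairs are allowed.
        allowed = {
            (unify(topic), unify(sub))
            for topic, subs in allowed_slots.items()
            for sub in subs
        }

    values = {(unify(p.topic), unify(p.sub_topic)): p.content for p in profiles}
    keys = set(values)
    if allowed is not None:
        keys &= allowed

    # One short line per existing profile, sorted for a stable prompt.
    lines = [
        f"- {topic}{sep}{sub}{sep}{values[(topic, sub)][:max_chars]}"
        for topic, sub in sorted(keys)
    ]
    return {
        "use_language": language,
        "strict_mode": strict_mode,
        "allowed_topic_subtopics": allowed,
        "already_topics_prompt": "\n".join(lines),
    }


profiles = [
    Profile("Interest", "Drinks", "Prefers green tea"),
    Profile("Work", "Title", "Nurse"),
]
strict = pack_profiles(profiles, {"interest": ["drinks"]}, strict_mode=True)
loose = pack_profiles(profiles, {"interest": ["drinks"]}, strict_mode=False)
```

Returning `None` for `allowed_topic_subtopics` when strict mode is off matches the `is not None` check that replaces `if STRICT_MODE:` in the last hunk.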

src/server/api/memobase_server/controllers/modal/chat/merge.py

Lines changed: 3 additions & 2 deletions
@@ -182,17 +182,18 @@ async def handle_profile_merge_or_valid(
             }
         )
     elif update_response["action"] == "ABORT":
+        oneline_response = r.data().replace("\n", " ")
         if runtime_profile is None:
             TRACE_LOG.info(
                 project_id,
                 user_id,
-                f"Invalid profile: {KEY}::{profile_content}, abort it\n<raw_response>\n{r.data()}\n</raw_response>",
+                f"Invalid profile: {KEY}::{profile_content}. <raw_response> {oneline_response} </raw_response>",
             )
         else:
             TRACE_LOG.info(
                 project_id,
                 user_id,
-                f"Invalid merge: {runtime_profile.attributes}, {profile_content}, abort it\n<raw_response>\n{r.data()}\n</raw_response>",
+                f"Invalid merge: {runtime_profile.attributes}, {profile_content}. <raw_response> {oneline_response} </raw_response>",
             )
         # session_merge_validate_results["delete"].append(runtime_profile.id)
         return Promise.resolve(None)
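The `merge.py` change flattens the raw LLM response so each `ABORT` log entry stays on one line. The sketch below reproduces that `replace("\n", " ")` step and, as an assumption not present in the diff, shows a stricter variant that also collapses carriage returns and repeated spaces. Both function names are hypothetical.

```python
def flatten_for_log(raw: str) -> str:
    """Collapse newlines into spaces, as the diff does with replace()."""
    return raw.replace("\n", " ")


def flatten_all_whitespace(raw: str) -> str:
    """Variant (not in the diff): collapses every run of whitespace to one space."""
    return " ".join(raw.split())


raw = "- ABORT\r\nreason: off-topic\n\nignored"
simple = flatten_for_log(raw)
strict = flatten_all_whitespace(raw)
message = f"Invalid profile: interest::tea. <raw_response> {strict} </raw_response>"
```

The simple version leaves a stray carriage return and a double space in this input. That may not matter for plain-text logs, but can matter if the log lines are parsed downstream.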
