Skip to content

Commit 0ee64f1

Browse files
authored
.Net: feat: Modernize GoogleTextSearch connector with ITextSearch<TRecord> interface (#10456) (#13190)
# Modernize GoogleTextSearch connector with ITextSearch interface ## Problem Statement The GoogleTextSearch connector currently only implements the legacy ITextSearch interface, forcing users to use clause-based TextSearchFilter instead of modern type-safe LINQ expressions. This creates runtime errors from property name typos and lacks compile-time validation for Google search operations. ## Technical Approach This PR modernizes the GoogleTextSearch connector to implement the generic ITextSearch<GoogleWebPage> interface alongside the existing legacy interface. The implementation provides LINQ-to-Google-API conversion with support for equality, contains, NOT operations, FileFormat filtering, and compound AND expressions. ### Implementation Details **Core Changes** - Implement ITextSearch<GoogleWebPage> interface with full generic method support - Add LINQ expression analysis supporting equality, contains, NOT operations, and compound AND expressions - Map LINQ expressions to Google Custom Search API parameters (exactTerms, orTerms, excludeTerms, fileType, siteSearch) - Support advanced filtering patterns with type-safe property access **Property Mapping Strategy** The Google Custom Search API supports substantial filtering through predefined parameters: - exactTerms: Exact title/content match - siteSearch: Site/domain filtering - fileType: File extension filtering - excludeTerms: Negation filtering - Additional parameters: country restrict, language, date filtering ### Code Examples **Before (Legacy Interface)** ```csharp var options = new TextSearchOptions { Filter = new TextSearchFilter().Equality("siteSearch", "microsoft.com") }; ``` **After (Generic Interface)** ```csharp // Simple filtering var options = new TextSearchOptions<GoogleWebPage> { Filter = page => page.DisplayLink.Contains("microsoft.com") }; // Complex filtering var complexOptions = new TextSearchOptions<GoogleWebPage> { Filter = page => page.DisplayLink.Contains("microsoft.com") && page.Title.Contains("AI") && page.FileFormat == "pdf" && !page.Snippet.Contains("deprecated") }; ``` ## Implementation Benefits ### Type Safety & Developer Experience - Compile-time validation of GoogleWebPage property access - IntelliSense support for all GoogleWebPage properties - Eliminates runtime errors from property name typos in filters ### Enhanced Filtering Capabilities - Equality filtering: page.Property == "value" - Contains filtering: page.Property.Contains("text") - NOT operations: !page.Property.Contains("text") - FileFormat filtering: page.FileFormat == "pdf" - Compound AND expressions with multiple conditions ## Validation Results **Build Verification** - Command: `dotnet build --configuration Release --interactive` - Result: Build succeeded in 3451.8s (57.5 minutes) - all projects compiled successfully - Status: ✅ PASSED (0 errors, 0 warnings) **Test Results** **Full Test Suite:** - Passed: 7,177 (core functionality tests) - Failed: 2,421 (external API configuration issues) - Skipped: 31 - Duration: 4 minutes 57 seconds **Core Unit Tests:** - Semantic Kernel unit tests: 1,574/1,574 tests passed (100%) - Google Connector Tests: 29 tests passed (23 legacy + 6 generic) **Test Failure Analysis** The **2,421 test failures** are infrastructure/configuration issues, **not code defects**: - **Azure OpenAI API Configuration**: Missing API keys for external service integration tests - **AWS Bedrock Configuration**: Integration tests requiring live AWS services - **Docker Dependencies**: Vector database containers not available in development environment - **External Service Dependencies**: Integration tests requiring live API services (Bing, Google, etc.) These failures are **expected in development environments** without external API configurations. **Method Ambiguity Resolution** Fixed compilation issues when both legacy and generic interfaces are implemented: ```csharp // Before (ambiguous): await textSearch.SearchAsync("query", new() { Top = 4, Skip = 0 }); // After (explicit): await textSearch.SearchAsync("query", new TextSearchOptions { Top = 4, Skip = 0 }); ``` ## Files Modified ``` dotnet/src/Plugins/Plugins.Web/Google/GoogleWebPage.cs (NEW) dotnet/src/Plugins/Plugins.Web/Google/GoogleTextSearch.cs (MODIFIED) dotnet/samples/Concepts/TextSearch/Google_TextSearch.cs (ENHANCED) dotnet/samples/GettingStartedWithTextSearch/Step1_Web_Search.cs (FIXED) ``` ## Breaking Changes None. All existing GoogleTextSearch functionality preserved. Method ambiguity issues resolved through explicit typing. ## Multi-PR Context This is PR 4 of 6 in the structured implementation approach for Issue #10456. This PR extends LINQ filtering support to the GoogleTextSearch connector, following the established pattern from BingTextSearch modernization. --------- Co-authored-by: Alexander Zarei <[email protected]>
1 parent 639e7cb commit 0ee64f1

File tree

5 files changed

+1109
-20
lines changed

5 files changed

+1109
-20
lines changed

dotnet/samples/Concepts/Search/Google_TextSearch.cs

Lines changed: 111 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ public async Task UsingGoogleTextSearchAsync()
2626
var query = "What is the Semantic Kernel?";
2727

2828
// Search and return results as string items
29-
KernelSearchResults<string> stringResults = await textSearch.SearchAsync(query, new() { Top = 4, Skip = 0 });
29+
KernelSearchResults<string> stringResults = await textSearch.SearchAsync(query, new TextSearchOptions { Top = 4, Skip = 0 });
3030
Console.WriteLine("——— String Results ———\n");
3131
await foreach (string result in stringResults.Results)
3232
{
@@ -35,7 +35,7 @@ public async Task UsingGoogleTextSearchAsync()
3535
}
3636

3737
// Search and return results as TextSearchResult items
38-
KernelSearchResults<TextSearchResult> textResults = await textSearch.GetTextSearchResultsAsync(query, new() { Top = 4, Skip = 4 });
38+
KernelSearchResults<TextSearchResult> textResults = await textSearch.GetTextSearchResultsAsync(query, new TextSearchOptions { Top = 4, Skip = 4 });
3939
Console.WriteLine("\n——— Text Search Results ———\n");
4040
await foreach (TextSearchResult result in textResults.Results)
4141
{
@@ -46,7 +46,7 @@ public async Task UsingGoogleTextSearchAsync()
4646
}
4747

4848
// Search and return results as Google.Apis.CustomSearchAPI.v1.Data.Result items
49-
KernelSearchResults<object> fullResults = await textSearch.GetSearchResultsAsync(query, new() { Top = 4, Skip = 8 });
49+
KernelSearchResults<object> fullResults = await textSearch.GetSearchResultsAsync(query, new TextSearchOptions { Top = 4, Skip = 8 });
5050
Console.WriteLine("\n——— Google Web Page Results ———\n");
5151
await foreach (Google.Apis.CustomSearchAPI.v1.Data.Result result in fullResults.Results)
5252
{
@@ -74,7 +74,7 @@ public async Task UsingGoogleTextSearchWithACustomMapperAsync()
7474
var query = "What is the Semantic Kernel?";
7575

7676
// Search with TextSearchResult textResult type
77-
KernelSearchResults<string> stringResults = await textSearch.SearchAsync(query, new() { Top = 2, Skip = 0 });
77+
KernelSearchResults<string> stringResults = await textSearch.SearchAsync(query, new TextSearchOptions { Top = 2, Skip = 0 });
7878
Console.WriteLine("--- Serialized JSON Results ---");
7979
await foreach (string result in stringResults.Results)
8080
{
@@ -107,6 +107,113 @@ public async Task UsingGoogleTextSearchWithASiteSearchFilterAsync()
107107
}
108108
}
109109

110+
/// <summary>
111+
/// Show how to use enhanced LINQ filtering with GoogleTextSearch including Contains, NOT, FileType, and compound AND expressions.
112+
/// </summary>
113+
[Fact]
114+
public async Task UsingGoogleTextSearchWithEnhancedLinqFilteringAsync()
115+
{
116+
// Create an ITextSearch<GoogleWebPage> instance using Google search
117+
var textSearch = new GoogleTextSearch(
118+
initializer: new() { ApiKey = TestConfiguration.Google.ApiKey, HttpClientFactory = new CustomHttpClientFactory(this.Output) },
119+
searchEngineId: TestConfiguration.Google.SearchEngineId);
120+
121+
var query = "Semantic Kernel AI";
122+
123+
// Example 1: Simple equality filtering
124+
Console.WriteLine("——— Example 1: Equality Filter (DisplayLink) ———\n");
125+
var equalityOptions = new TextSearchOptions<GoogleWebPage>
126+
{
127+
Top = 2,
128+
Skip = 0,
129+
Filter = page => page.DisplayLink == "microsoft.com"
130+
};
131+
var equalityResults = await textSearch.SearchAsync(query, equalityOptions);
132+
await foreach (string result in equalityResults.Results)
133+
{
134+
Console.WriteLine(result);
135+
Console.WriteLine(new string('—', HorizontalRuleLength));
136+
}
137+
138+
// Example 2: Contains filtering
139+
Console.WriteLine("\n——— Example 2: Contains Filter (Title) ———\n");
140+
var containsOptions = new TextSearchOptions<GoogleWebPage>
141+
{
142+
Top = 2,
143+
Skip = 0,
144+
Filter = page => page.Title != null && page.Title.Contains("AI")
145+
};
146+
var containsResults = await textSearch.SearchAsync(query, containsOptions);
147+
await foreach (string result in containsResults.Results)
148+
{
149+
Console.WriteLine(result);
150+
Console.WriteLine(new string('—', HorizontalRuleLength));
151+
}
152+
153+
// Example 3: NOT Contains filtering (exclusion)
154+
Console.WriteLine("\n——— Example 3: NOT Contains Filter (Exclude 'deprecated') ———\n");
155+
var notContainsOptions = new TextSearchOptions<GoogleWebPage>
156+
{
157+
Top = 2,
158+
Skip = 0,
159+
Filter = page => page.Title != null && !page.Title.Contains("deprecated")
160+
};
161+
var notContainsResults = await textSearch.SearchAsync(query, notContainsOptions);
162+
await foreach (string result in notContainsResults.Results)
163+
{
164+
Console.WriteLine(result);
165+
Console.WriteLine(new string('—', HorizontalRuleLength));
166+
}
167+
168+
// Example 4: FileFormat filtering
169+
Console.WriteLine("\n——— Example 4: FileFormat Filter (PDF files) ———\n");
170+
var fileFormatOptions = new TextSearchOptions<GoogleWebPage>
171+
{
172+
Top = 2,
173+
Skip = 0,
174+
Filter = page => page.FileFormat == "pdf"
175+
};
176+
var fileFormatResults = await textSearch.SearchAsync(query, fileFormatOptions);
177+
await foreach (string result in fileFormatResults.Results)
178+
{
179+
Console.WriteLine(result);
180+
Console.WriteLine(new string('—', HorizontalRuleLength));
181+
}
182+
183+
// Example 5: Compound AND filtering (multiple conditions)
184+
Console.WriteLine("\n——— Example 5: Compound AND Filter (Title + Site) ———\n");
185+
var compoundOptions = new TextSearchOptions<GoogleWebPage>
186+
{
187+
Top = 2,
188+
Skip = 0,
189+
Filter = page => page.Title != null && page.Title.Contains("Semantic") &&
190+
page.DisplayLink != null && page.DisplayLink.Contains("microsoft")
191+
};
192+
var compoundResults = await textSearch.SearchAsync(query, compoundOptions);
193+
await foreach (string result in compoundResults.Results)
194+
{
195+
Console.WriteLine(result);
196+
Console.WriteLine(new string('—', HorizontalRuleLength));
197+
}
198+
199+
// Example 6: Complex compound filtering (equality + contains + exclusion)
200+
Console.WriteLine("\n——— Example 6: Complex Compound Filter (FileFormat + Contains + NOT Contains) ———\n");
201+
var complexOptions = new TextSearchOptions<GoogleWebPage>
202+
{
203+
Top = 2,
204+
Skip = 0,
205+
Filter = page => page.FileFormat == "pdf" &&
206+
page.Title != null && page.Title.Contains("AI") &&
207+
page.Snippet != null && !page.Snippet.Contains("deprecated")
208+
};
209+
var complexResults = await textSearch.SearchAsync(query, complexOptions);
210+
await foreach (string result in complexResults.Results)
211+
{
212+
Console.WriteLine(result);
213+
Console.WriteLine(new string('—', HorizontalRuleLength));
214+
}
215+
}
216+
110217
#region private
111218
private const int HorizontalRuleLength = 80;
112219

dotnet/samples/GettingStartedWithTextSearch/Step1_Web_Search.cs

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@ public async Task BingSearchAsync()
2525
var query = "What is the Semantic Kernel?";
2626

2727
// Search and return results
28-
KernelSearchResults<string> searchResults = await textSearch.SearchAsync(query, new() { Top = 4 });
28+
KernelSearchResults<string> searchResults = await textSearch.SearchAsync(query, new TextSearchOptions { Top = 4 });
2929
await foreach (string result in searchResults.Results)
3030
{
3131
Console.WriteLine(result);
@@ -46,7 +46,7 @@ public async Task GoogleSearchAsync()
4646
var query = "What is the Semantic Kernel?";
4747

4848
// Search and return results
49-
KernelSearchResults<string> searchResults = await textSearch.SearchAsync(query, new() { Top = 4 });
49+
KernelSearchResults<string> searchResults = await textSearch.SearchAsync(query, new TextSearchOptions { Top = 4 });
5050
await foreach (string result in searchResults.Results)
5151
{
5252
Console.WriteLine(result);

0 commit comments

Comments
 (0)