A powerful, easy-to-use fulltext search engine for Doctrine entities with automatic relevance scoring, query normalization, and machine learning-powered suggestions.
- Define entity and column mappings with simple configuration
- Automatic relevance scoring and result sorting
- Built-in "Did you mean?" suggestions using analytics
- Query normalization with stopword filtering
- Support for entity relationships and custom getters
- Nette Framework integration via DIC extension
- Zero Configuration Start: Define your entity map and start searching immediately
- Intelligent Scoring: Results are automatically scored and sorted by relevance (0-512 points)
- Query Normalization: Automatic stopword removal, duplicate filtering, and query sanitization
- Relationship Support: Search across related entities using dot notation
- Analytics-Powered: Machine learning suggestions based on search history
- Extensible Architecture: Override query normalizer and score calculator via interfaces
- Performance Optimized: PARTIAL selection for efficient database queries with configurable timeout
The package follows a modular architecture with clear separation of concerns:
┌─────────────────────────────────────────────────┐
│                     Search                      │
│               (Main Entry Point)                │
└─────────────────────────────────────────────────┘
                         │
        ┌────────────────┼─────────────────┐
        ▼                ▼                 ▼
┌──────────────┐ ┌───────────────┐ ┌──────────────┐
│  Container   │ │SelectorBuilder│ │EntityMapNorm.│
│  (Services)  │ │ (Fluent API)  │ │ (Validation) │
└──────────────┘ └───────────────┘ └──────────────┘
        │
    ┌───┴───────┬─────────────┬─────────────┐
    ▼           ▼             ▼             ▼
┌────────┐ ┌──────────┐ ┌───────────┐ ┌───────────┐
│  Core  │ │Analytics │ │   Query   │ │   Score   │
│(Search)│ │(Did you  │ │Normalizer │ │Calculator │
│        │ │ mean?)   │ │           │ │           │
└────────┘ └──────────┘ └───────────┘ └───────────┘
    │
    ▼
┌──────────────┐
│ QueryBuilder │
│    (DQL)     │
└──────────────┘
    │
    ▼
┌─────────────────────────────────────────────────┐
│                  SearchResult                   │
│      (Contains SearchItem[] with scoring)       │
└─────────────────────────────────────────────────┘
| Component | Purpose |
|---|---|
| Search | Main entry point, orchestrates the search process |
| SelectorBuilder | Fluent API for building search queries with type validation |
| Container | Service container holding all dependencies (PSR-11 compatible) |
| Core | Internal search logic, processes candidate results |
| QueryBuilder | Builds DQL queries with JOIN support for relations |
| Analytics | Stores search statistics, powers "Did you mean?" feature |
| QueryNormalizer | Normalizes queries, removes stopwords |
| ScoreCalculator | Calculates relevance scores with year boost |
| SearchResult | Collection of results implementing Iterator |
| SearchItem | Single search result with entity, title, snippet, and score |
It's best to use Composer for installation, and you can also find the package on Packagist and GitHub.
To install, simply use the command:
$ composer require baraja-core/doctrine-fulltext-search

- PHP 8.0 or higher
- ext-mbstring
- Doctrine ORM 2.9+
Register the DIC extension in your NEON configuration:
extensions:
    doctrineFulltextSearch: Baraja\Search\DoctrineFulltextSearchExtension

The extension automatically registers:
- Search service
- QueryNormalizer service
- ScoreCalculator service
- SearchAccessor accessor
- QueryBuilder service
You can create an instance of Search manually:
use Baraja\Search\Search;
use Doctrine\ORM\EntityManagerInterface;
$search = new Search($entityManager);

With custom normalizer and score calculator:
$search = new Search(
    em: $entityManager,
    queryNormalizer: new CustomQueryNormalizer(),
    scoreCalculator: new CustomScoreCalculator(),
);

The simplest way to perform a search is by defining an entity map:
$results = $search->search($query, [
    Article::class => [':title', 'description', 'content'],
    User::class => ':username',
    Product::class => [':name', 'sku', '!internalCode'],
]);
echo $results; // Uses built-in HTML renderer

For better type safety and IDE autocompletion, use the SelectorBuilder:
$results = $search->selectorBuilder($query)
    ->addEntity(Article::class)
        ->addColumnTitle('title')
        ->addColumn('description')
        ->addColumn('content')
    ->addEntity(User::class)
        ->addColumnTitle('username')
    ->addEntity(Product::class)
        ->addColumnTitle('name')
        ->addColumn('sku')
        ->addColumnSearchOnly('internalCode')
    ->search();

Filter results with custom conditions:
$results = $search->selectorBuilder($query)
    ->addEntity(Article::class)
        ->addColumnTitle('title')
        ->addColumn('content')
        ->addWhere('active = TRUE')
        ->addWhere('publishedAt <= NOW()')
    ->search();

Column names support special prefixes that control how they're used in search:
| Modifier | Syntax | Description |
|---|---|---|
| Title | `:column` | Used as result caption, displayed even without match |
| Search Only | `!column` | Searched but excluded from snippet output |
| Select Only | `_column` | Loaded but not searched or included in snippet |
| Normal | `column` | Searched and included in snippet |
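
The prefix rules in the table can be pictured as a small parser. The following is a minimal sketch, not the package's code: `parseColumnModifierSketch()` and the role names it returns are hypothetical and exist only to illustrate how a definition such as `:title` splits into a role and a bare column name.

```php
<?php

/**
 * Hypothetical helper (not part of the package): splits a column
 * definition such as ':title' into its role and bare column name.
 *
 * @return array{role: string, column: string}
 */
function parseColumnModifierSketch(string $definition): array
{
    // Prefix => role, following the modifier table above.
    $roles = [':' => 'title', '!' => 'search-only', '_' => 'select-only'];
    $prefix = $definition[0] ?? '';

    if (isset($roles[$prefix])) {
        return ['role' => $roles[$prefix], 'column' => substr($definition, 1)];
    }

    // No known prefix: the column is searched and shown in the snippet.
    return ['role' => 'normal', 'column' => $definition];
}

foreach ([':title', 'description', '!slug', '_authorId'] as $definition) {
    $parsed = parseColumnModifierSketch($definition);
    printf("%-12s => %s (%s)\n", $definition, $parsed['column'], $parsed['role']);
}
```
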
$entityMap = [
    Article::class => [
        ':title',       // Title column - always shown
        'description',  // Normal - searched and in snippet
        '!slug',        // Search only - searched but not in snippet
        '_authorId',    // Select only - loaded but not searched
    ],
];

Using SelectorBuilder:
$search->selectorBuilder($query)
    ->addEntity(Article::class)
        ->addColumnTitle('title')          // :title
        ->addColumn('description')         // description
        ->addColumnSearchOnly('slug')      // !slug
        ->addColumnSelectOnly('authorId')  // _authorId
    ->search();

Search across related entities using dot notation:
$entityMap = [
    Article::class => [
        ':title',
        'author.name',            // ManyToOne: Article -> Author
        'categories.name',        // ManyToMany: Article -> Categories
        'content.versions.text',  // Deep relation chain
    ],
];

When the getter method differs from the column name:
$entityMap = [
    Article::class => [
        'versions(content)', // Joins 'versions' but calls getContent()
    ],
];

Wrap phrases in quotes for exact matching:
$query = '"to be or not to be"';
// Finds exact phrase

Exclude words with minus prefix:

$query = 'linux -ubuntu';
// Finds "linux" but excludes results containing "ubuntu"

Search for number ranges:

$query = 'conference 2020..2024';
// Finds results containing years 2020, 2021, 2022, 2023, or 2024

The search() method returns a SearchResult entity implementing Iterator:
$results = $search->search($query, $entityMap);
// Total count
$count = $results->getCountResults();
// Search time in milliseconds
$time = $results->getSearchTime();
// "Did you mean?" suggestion
$suggestion = $results->getDidYouMean();
// Iterate results
foreach ($results as $item) {
    echo $item->getTitle();
}

// Get first 10 results
$items = $results->getItems();
// With pagination
$items = $results->getItems(limit: 20, offset: 40);
// Filter by entity type
$articles = $results->getItemsOfType(Article::class, limit: 10);
// Get only IDs
$ids = $results->getIds(limit: 100);

Each result is a SearchItem with these methods:
| Method | Return Type | Description |
|---|---|---|
| `getId()` | `string\|int` | Entity identifier |
| `getEntity()` | `object` | Original Doctrine entity (PARTIAL loaded) |
| `getTitle()` | `?string` | Normalized title |
| `getTitleHighlighted()` | `?string` | Title with `<i class="highlight">` tags |
| `getSnippet()` | `string` | Best matching text snippet |
| `getSnippetHighlighted()` | `string` | Snippet with highlighted words |
| `getScore()` | `int` | Relevance score (0-512) |
| `entityToArray()` | `array` | Entity as normalized array |
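
The highlighted getters in the table return text with matches wrapped in `<i class="highlight">` tags. As a rough illustration of that output shape, the sketch below wraps case-insensitive word matches with a regular expression. `highlightSketch()` is a hypothetical name, and unlike the package's highlighter it is not accent-aware.

```php
<?php

/**
 * Illustrative only: wraps each case-insensitive occurrence of the given
 * words in <i class="highlight">. Not the package's highlighter.
 *
 * @param string[] $words
 */
function highlightSketch(string $text, array $words): string
{
    // Escape regex metacharacters and drop empty words.
    $escaped = array_map(static fn(string $w): string => preg_quote($w, '/'), $words);
    $escaped = array_filter($escaped, static fn(string $w): bool => $w !== '');
    if ($escaped === []) {
        return $text;
    }

    return (string) preg_replace(
        '/(' . implode('|', $escaped) . ')/iu',
        '<i class="highlight">$1</i>',
        $text,
    );
}

echo highlightSketch('Doctrine fulltext search', ['search', 'doctrine']);
```

If the text comes from user-editable content, escape it (for example with `htmlspecialchars()`) before adding the highlight tags.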
For rapid prototyping, SearchResult implements __toString():
echo $results;

This outputs styled HTML with:
- Result count and search time
- "Did you mean?" suggestion (if available)
- Results with highlighted titles and snippets
Add ?debugMode=1 to the URL to see scores in the output.
When search returns few or no results, the engine can suggest alternative queries:
$results = $search->search('programing', $entityMap);
if ($results->getCountResults() === 0) {
    $suggestion = $results->getDidYouMean();
    if ($suggestion !== null) {
        echo "Did you mean: $suggestion?"; // "programming"
    }
}

- Every search query and result count is stored in the search__search_query table
- Queries are scored based on frequency and result count
- When needed, the system finds similar queries using Levenshtein distance
- The best match is suggested based on combined scoring
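
The steps above can be sketched with PHP's built-in `levenshtein()`. This is a simplified illustration, not the package's algorithm: `suggestQuerySketch()`, the in-memory `$history` array, the distance limit, and the tie-breaking by frequency are all assumptions standing in for the stored analytics and the combined scoring.

```php
<?php

/**
 * Rough sketch of the idea: pick the stored query closest to the input
 * by Levenshtein distance, breaking ties by search frequency.
 *
 * @param array<string, int> $history stored query => frequency (hypothetical data)
 */
function suggestQuerySketch(string $input, array $history, int $maxDistance = 2): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;
    $bestFrequency = -1;

    foreach ($history as $candidate => $frequency) {
        $candidate = (string) $candidate; // numeric array keys arrive as int
        $distance = levenshtein($input, $candidate);
        if ($candidate === $input || $distance > $maxDistance) {
            continue; // never suggest the query itself or distant queries
        }
        $isCloser = $distance < $bestDistance;
        $isMorePopular = $distance === $bestDistance && $frequency > $bestFrequency;
        if ($isCloser || $isMorePopular) {
            $best = $candidate;
            $bestDistance = $distance;
            $bestFrequency = $frequency;
        }
    }

    return $best;
}

$history = ['programming' => 42, 'program' => 7, 'doctrine' => 15];
echo suggestQuerySketch('programing', $history) ?? 'no suggestion';
```

Note that `levenshtein()` compares bytes, so accented characters count as more than one edit; a real implementation would likely normalize the strings first.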
Disable analytics for specific searches:
$results = $search->search($query, $entityMap, useAnalytics: false);
// Or with SelectorBuilder
$results = $search->selectorBuilder($query)
    ->addEntity(Article::class)
        ->addColumnTitle('title')
    ->search(useAnalytics: false);

Results are scored on a scale of 0-512 points based on multiple factors:
| Factor | Points | Description |
|---|---|---|
| Exact match | +32 | Haystack equals query exactly |
| Contains query | +4 | Query found as substring |
| Substring count | +1-3 | Bonus per occurrence (max 3) |
| Word match | +1-4 | Per word occurrence (max 4) |
| Empty content | -16 | Penalty for empty fields |
| Search-only column | -4 | Reduced weight for ! columns |
| Title column | x6-10 | Multiplier for : columns |
| Year boost | x1-6 | Bonus for current/recent years |
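
To make the additive factors concrete, here is a simplified sketch. It is not the package's ScoreCalculator: `scoreSketch()` is a hypothetical function, the way the factors combine is an assumption, and the title and year multipliers are left out entirely.

```php
<?php

/**
 * Simplified scoring sketch: applies only the additive factors from the
 * table and clamps the result to 0-512.
 */
function scoreSketch(string $haystack, string $query, bool $searchOnly = false): int
{
    $haystack = mb_strtolower(trim($haystack));
    $query = mb_strtolower(trim($query));
    if ($query === '') {
        return 0;
    }

    $score = $haystack === '' ? -16 : 0; // penalty for empty content
    if ($haystack === $query) {
        $score += 32; // exact match
    }

    $occurrences = mb_substr_count($haystack, $query);
    if ($occurrences > 0) {
        $score += 4;                    // contains the query as a substring
        $score += min($occurrences, 3); // substring count bonus, capped at 3
    }

    foreach (array_unique(explode(' ', $query)) as $word) {
        if ($word !== '') {
            $score += min(mb_substr_count($haystack, $word), 4); // word match, capped at 4
        }
    }

    if ($searchOnly) {
        $score -= 4; // reduced weight for "!" columns
    }

    return max(0, min(512, $score));
}

printf("%d %d\n", scoreSketch('linux', 'linux'), scoreSketch('Linux kernel and Linux tools', 'linux'));
```
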
The score calculator automatically boosts results containing recent years:
- Current year and adjacent years receive higher scores
- Particularly relevant for news, events, and time-sensitive content
Implement IScoreCalculator for custom scoring:
use Baraja\Search\ScoreCalculator\IScoreCalculator;

class CustomScoreCalculator implements IScoreCalculator
{
    public function process(string $haystack, string $query, ?string $mode = null): int
    {
        $score = 0;
        // Your custom scoring logic

        return $score;
    }
}

Register in Nette DI:
services:
    - CustomScoreCalculator

The container will automatically use your implementation.
Queries are automatically normalized before processing:
- Whitespace normalization: Multiple spaces reduced to single
- Length limit: Truncated to 255 characters
- Stopword removal: Common words filtered (in, it, a, the, of, or, etc.)
- Duplicate removal: Repeated words kept only once
- Special character handling: %, _, {, } converted or removed
- Hash removal: #123 becomes 123
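
The normalization steps listed above can be sketched in a few lines. This is an illustration under assumptions, not the package's QueryNormalizer: `normalizeQuerySketch()` is a hypothetical function, the stopword list is only the subset named above, the order of steps is a guess, and quoted phrases and minus-prefixed words get no special treatment.

```php
<?php

/**
 * Sketch of the normalization steps: special characters, hash removal,
 * length limit, whitespace, stopwords, and duplicate words.
 */
function normalizeQuerySketch(string $query): string
{
    $stopwords = ['in', 'it', 'a', 'the', 'of', 'or']; // subset named in the list above

    $query = str_replace(['%', '_', '{', '}'], ' ', $query);  // drop wildcards and braces
    $query = (string) preg_replace('/#(\d+)/', '$1', $query); // "#123" becomes "123"
    $query = mb_substr($query, 0, 255);                       // length limit

    // Splitting on whitespace also collapses repeated spaces.
    $words = preg_split('/\s+/u', trim($query), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $seen = [];
    $result = [];
    foreach ($words as $word) {
        $key = mb_strtolower($word);
        if (in_array($key, $stopwords, true) || isset($seen[$key])) {
            continue; // skip stopwords and repeated words
        }
        $seen[$key] = true;
        $result[] = $word;
    }

    return implode(' ', $result);
}

echo normalizeQuerySketch('The  history of   #123 linux linux');
```
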
Implement IQueryNormalizer for project-specific normalization:
use Baraja\Search\QueryNormalizer\IQueryNormalizer;

class CustomQueryNormalizer implements IQueryNormalizer
{
    public function normalize(string $query): string
    {
        $normalizedQuery = trim($query);
        // Your normalization logic

        return $normalizedQuery;
    }
}

Configure maximum search time (default: 2500ms):
$container = new Container(
    entityManager: $em,
    searchTimeout: 5000, // 5 seconds
);
$search = new Search($em, container: $container);

Disable "Did you mean?" suggestions:
$results = $search->search(
    query: $query,
    entityMap: $entityMap,
    searchExactly: true,
);

Add WHERE conditions to all entity queries:
$results = $search->search(
    query: $query,
    entityMap: $entityMap,
    userConditions: [
        'e.active = TRUE',
        'e.deletedAt IS NULL',
    ],
);

The package creates one database table for analytics:
Table: search__search_query
| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key |
| query | string | Normalized search query (unique) |
| frequency | int | Number of times searched |
| results | int | Last result count |
| score | int | Calculated relevance (0-100) |
| insertedDate | datetime | First search time |
| updatedDate | datetime | Last search time |
The table is automatically created when using Doctrine migrations with the package's entity mappings.
The default highlighter wraps matched words in:
<i class="highlight">matched word</i>

Add CSS for styling:
.highlight {
    background: rgba(68, 134, 255, 0.35);
}
.search__info {
    padding: .5em 0;
    margin-bottom: .5em;
    border-bottom: 1px solid #eee;
}
.search__did_you_mean {
    color: #ff421e;
}

Use Helpers::highlightFoundWords() with custom pattern:
use Baraja\Search\Helpers;
$highlighted = Helpers::highlightFoundWords(
    haystack: $text,
    words: $query,
    replacePattern: '<mark>\0</mark>',
);

The search engine handles accented characters intelligently:
- ASCII conversion: Queries are converted for matching (café matches cafe)
- Accent-aware highlighting: Original text preserved with proper highlighting
- Character mapping: Supports Czech, Slovak, Polish, and other Central European languages
Supported character mappings:
- a matches á, ä
- c matches č
- e matches è, ê, é, ě
- n matches ň
- r matches ř, ŕ
- s matches š, ś
- z matches ž, ź
- And more...
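
One way to picture accent-insensitive matching is a regular expression in which each base letter also accepts its accented variants. The sketch below is illustrative and not the package's implementation: `accentInsensitivePattern()` is a hypothetical function, and its variant table is an assumption based on the mappings listed above.

```php
<?php

/**
 * Builds a case-insensitive regex in which each base letter also matches
 * its accented variants (assumed mapping, see the list above).
 */
function accentInsensitivePattern(string $word): string
{
    $variants = [
        'a' => 'aáä', 'c' => 'cč', 'e' => 'eèêéě', 'n' => 'nň',
        'r' => 'rřŕ', 's' => 'sšś', 'z' => 'zžź',
    ];

    $pattern = '';
    foreach (mb_str_split(mb_strtolower($word)) as $char) {
        // Letters with known variants become a character class;
        // everything else is matched literally.
        $pattern .= isset($variants[$char])
            ? '[' . $variants[$char] . ']'
            : preg_quote($char, '/');
    }

    return '/' . $pattern . '/iu';
}

var_dump(preg_match(accentInsensitivePattern('cafe'), 'Café au lait') === 1);
```
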
InvalidArgumentException: Column "title" is not valid property of "App\Entity\Article".
Did you mean "headline"?
The package validates column names against entity metadata. Check your entity properties or use the suggested alternative.
- Verify entity has data in the database
- Check if columns contain searchable text
- Try disabling query normalization for debugging
- Verify WHERE conditions aren't too restrictive
- Add database indexes on searched columns
- Reduce the number of entities/columns in search
- Lower the search timeout
- Use ! modifier for large text columns
- Consider _ modifier for columns only needed in results
Jan Barášek
- Website: https://baraja.cz
- GitHub: @janbarasek
baraja-core/doctrine-fulltext-search is licensed under the MIT license. See the LICENSE file for more details.