Lucene Configuration and Indexing
The indexing administration console allows you to manage the search index lifecycle. It provides complete control over the Lucene engine and enables you to perform the maintenance operations necessary for optimal performance.
Accessing Indexing Tools
Access Path: Tools > AdminTools Menu > AdminFulltextIndex

Index Status
Real-time Statistics
Display: Dashboard showing the current indexing status
Available Information:
- Indexed documents: Number vs total
- Indexed files: Processed attachments
- Index size: Disk space occupied
- Last update: Timestamp
Interpretation:
| Status | Meaning | Action |
|---|---|---|
| Indexed documents < Total | Indexing behind | Force indexing |
| Size > 10 GB | Large index | Optimization recommended |
| Last update older than 1 h | Indexing task possibly blocked | Check ProcessTimers |
| 100% indexed | Healthy status | No action needed |
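
The interpretation table can be read as a small set of rules. The following is a minimal Python sketch, assuming the dashboard values have been copied by hand or by script into a hypothetical `IndexStats` structure. Neither `IndexStats` nor `recommend_actions` is part of the product; only the thresholds come from the table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class IndexStats:
    """Hypothetical snapshot of the values shown on the dashboard."""
    indexed_documents: int
    total_documents: int
    index_size_gb: float
    last_update: datetime


def recommend_actions(stats: IndexStats, now: datetime) -> List[str]:
    """Return the actions suggested by the interpretation table."""
    actions = []
    if stats.indexed_documents < stats.total_documents:
        actions.append("Force indexing")            # indexing is behind
    if stats.index_size_gb > 10:
        actions.append("Optimization recommended")  # large index
    if now - stats.last_update > timedelta(hours=1):
        actions.append("Check ProcessTimers")       # task may be blocked
    # No rule fired: treated here as the healthy row of the table.
    return actions or ["No action needed"]
```

Several rows can apply at once, so the sketch returns a list rather than a single status.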
Indexing Operations
Force Reindexing
Button: "Force reindexing"
Function: Launches a global index update
Types:
Incremental indexing:
- Indexes only new or modified documents
- Fast (a few minutes)
- Usage: Daily updates
Complete indexing:
- Rebuilds the index from scratch
- Slow (several hours on large volumes)
- Usage: After major configuration changes
When to use it:
- After modifying indexed fields
- After bulk imports
- If documents cannot be found
- After enabling attachment indexing
- Following an indexing error
Procedure:
- Choose the type (incremental vs complete)
- Click "Force"
- Wait for completion (success message)
- Check the statistics
Precautions:
- Perform during off-peak hours
- May temporarily slow down the application
- For large volumes (> 10,000 documents), schedule outside production hours
Index Optimization
Button: "Optimize index"
Function: Defragments and speeds up searches
Principle:
- Indexing tends to fragment the index over time
- Optimization consolidates segments
- Can improve search performance by 20 to 50%, depending on how fragmented the index was
Recommended frequency:
- Small databases (< 10,000 documents): Monthly
- Medium databases (10,000 - 100,000): Weekly
- Large databases (> 100,000): Daily or via scheduled task
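
As an illustration, the recommended frequencies can be expressed as a lookup on the document count. This is a sketch only: `optimization_frequency` is a hypothetical helper, and placing exactly 10,000 and exactly 100,000 documents in the "weekly" band is an assumption, since the ranges above leave those boundaries open.

```python
def optimization_frequency(document_count: int) -> str:
    """Map a document count to the recommended optimization frequency."""
    if document_count < 10_000:
        return "monthly"   # small database
    if document_count <= 100_000:
        return "weekly"    # medium database (boundaries assumed inclusive)
    return "daily"         # large database, or use a scheduled task
```

For example, a database of 45,000 documents falls in the medium band and would be optimized weekly.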
Procedure:
- Click "Optimize"
- Wait for completion (may take several minutes)
- Check the success message
- Test search performance
Automatic scheduling:
- Configure via AdminTools > ProcessTimers
- Create an "Index optimization" task
- Schedule during nighttime
Explore Document Index
Function: Allows technical verification of indexed content for a specific document
Usage: Debug why a document is not found
Procedure:
- Enter the document identifier (ID)
- Click "Explore"
- Review the results: indexed fields, content of each field, extracted terms, and indexing status
Diagnosis:
- If document doesn't appear: Not indexed
- If a field is missing: Configuration needs adjustment
- If content is empty: Extraction problem
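
The three diagnosis rules can be sketched in Python as follows. The sketch assumes the result of "Explore" has been transcribed into a plain dictionary mapping field names to their indexed content (or `None` when the document does not appear). The function `diagnose_document` and that dictionary shape are hypothetical, not a product API.

```python
from typing import Dict, List, Optional


def diagnose_document(explored: Optional[Dict[str, str]],
                      expected_fields: List[str]) -> List[str]:
    """Apply the diagnosis rules to a transcribed 'Explore' result."""
    if explored is None:
        return ["Not indexed"]  # the document does not appear at all
    findings = []
    for field in expected_fields:
        if field not in explored:
            findings.append(f"{field}: field missing, configuration needs adjustment")
        elif not explored[field].strip():
            findings.append(f"{field}: content empty, extraction problem")
    return findings  # empty list: nothing wrong was detected
```

An empty result means every expected field is present and has content, so the cause of a failed search lies elsewhere (for example, in the query itself).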
Performance Optimization
Best Practices
1. Field targeting:
- Index only relevant fields
- Avoid technical fields (ID, GUID)
- Prioritize rich textual fields
2. Attachment limitations:
- Very large files (> 50 MB): Consider exclusion
- Check IFilter availability
- Monitor index size
3. Regular optimization:
- Schedule automatic optimization
- Frequency adapted to volume
- Monitor performance improvements
4. Size monitoring:
- Index > 10 GB: Urgent optimization
- Index > 50 GB: Configuration review
- Periodic cleanup of obsolete documents
Performance Indicators
Search time:
- Target: < 1 second
- If > 3 seconds: Optimization needed
- Test with real queries
Indexing time:
- Simple document: < 1 second
- Document with attachments: 2-5 seconds
- If higher: Investigation needed
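
When measurements are collected by hand or by a test script, the indicators above can be checked with a few lines of Python. This is a sketch under stated assumptions: both function names are hypothetical, and the label used for search times between 1 and 3 seconds is an assumption, because that range is not named above.

```python
def classify_search_time(seconds: float) -> str:
    """Compare a measured search time with the targets."""
    if seconds < 1.0:
        return "on target"
    if seconds > 3.0:
        return "optimization needed"
    return "above target"  # assumed label for the 1-3 s range


def indexing_needs_investigation(seconds: float, has_attachments: bool) -> bool:
    """True when indexing one document took longer than the expected range."""
    if has_attachments:
        return seconds > 5.0   # expected: 2-5 seconds
    return seconds >= 1.0      # expected: under 1 second
```

Feeding these functions with timings from real user queries, rather than synthetic ones, keeps the results representative.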
Business Case: "Invisible" Document
Problem: A Quality Manager just uploaded a procedure "PRO-2024.pdf" and cannot find it in search 2 minutes later.
Diagnosis:
1. Check indexing status:
- Access AdminTools > AdminFulltextIndex
- Review statistics
- Check "Last update"
2. Check scheduled task:
- AdminTools > ProcessTimers
- "Incremental indexing" task
- Check last execution
Solution:
- If the task has not run: force a manual incremental indexing
- If the task ran but the document is still missing: check the configuration (is the "Procedure" form indexed?) and the filters (is the document excluded?)
- Wait for processing to complete
- Test the search
Result: The document appears in search results as soon as it has been indexed
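
The decision path of this business case can be summarized as a short triage function. It is a sketch only: `triage_missing_document` and its three boolean inputs are hypothetical, and the answers have to be gathered manually from AdminFulltextIndex and ProcessTimers.

```python
def triage_missing_document(task_has_run: bool,
                            form_indexed: bool,
                            excluded_by_filter: bool) -> str:
    """Return the next step for a document that search cannot find."""
    if not task_has_run:
        return "Force a manual incremental indexing"
    if not form_indexed:
        return "Check configuration: the form is not indexed"
    if excluded_by_filter:
        return "Check filters: the document is excluded"
    # Task ran and configuration is fine: indexing may still be in progress.
    return "Wait for processing to complete, then test the search"
```

In the scenario above, a document uploaded 2 minutes ago with an incremental task that has not run since would lead to the first branch.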
Business Case: Material Certificate Search
Context: The Quality department receives dozens of material certificates in PDF daily. Technicians need to find a certificate by searching for the batch number inside the PDF.
Configuration:
1. FullText search setup:
- "Material certificate" form
- ☑ Index attachments
- Fields: Title, Reference, Supplier, Date
2. IFilters verification:
- Verify PDF IFilters are installed
- Test text extraction
3. Complete rebuild:
- AdminTools > AdminFulltextIndex
- Launch Complete reindexing
- Wait (may take 1-2h depending on volume)
4. Testing:
- Search for a known batch number
- Verify the certificate appears
- Test with multiple terms
Result: Users can now type "LOT-2024-A456" and instantly find the corresponding certificate, even if this number only appears in the PDF.
Preventive Maintenance
Regular Controls
Daily:
- Verify indexing task is running
- Check indexing errors
Weekly:
- Check statistics (indexed documents vs total)
- Run optimization
- Test search performance
Monthly:
- Analyze index size
- Clean up obsolete documents
- Check configuration (new forms?)
- Export statistics
Quarterly:
- Complete configuration review
- Adjust indexed fields
- Train users on advanced search
- Performance analysis
Alerts
Configure alerts for:
- Indexing task failing > 3 times
- Index covering < 90% of documents for > 24h
- Index size > critical threshold
- Search time > 5 seconds
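
These alert conditions can be evaluated against periodic measurements, for example in an external monitoring script. The sketch below makes several assumptions: `HealthSample` and `triggered_alerts` are hypothetical names, the second rule is read as index coverage staying below 90% of documents for more than 24 hours, and the critical size threshold is supplied by the caller because no value is fixed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HealthSample:
    """Hypothetical monitoring sample; field names are illustrative only."""
    task_failures: int     # failures of the indexing task
    indexed_ratio: float   # indexed documents / total documents
    hours_below_90: float  # how long coverage has stayed under 90%
    index_size_gb: float
    search_time_s: float


def triggered_alerts(sample: HealthSample, critical_size_gb: float) -> List[str]:
    """Return the names of the alert conditions met by this sample."""
    alerts = []
    if sample.task_failures > 3:
        alerts.append("indexing task failing")
    if sample.indexed_ratio < 0.90 and sample.hours_below_90 > 24:
        alerts.append("index coverage low")
    if sample.index_size_gb > critical_size_gb:
        alerts.append("index size critical")
    if sample.search_time_s > 5:
        alerts.append("search too slow")
    return alerts
```

Keeping the size threshold as a parameter lets each installation choose a value suited to its own disk capacity.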
Troubleshooting
Non-indexed Documents
Symptoms: Some documents cannot be found through search
Diagnosis:
- Check configuration (form indexed?)
- Check filters (document excluded?)
- Explore specific document index
- Review indexing logs
Solutions:
- Adjust configuration
- Force reindexing
- Check file access permissions
Slow Indexing
Symptoms: Indexing takes a long time
Causes:
- Large attachments
- Slow IFilters
- Fragmented database
- Insufficient server resources
Solutions:
- Limit indexed file sizes
- Optimize database
- Schedule indexing during off-peak hours
- Increase server resources
Slow Searches
Symptoms: Results take > 5 seconds to display
Causes:
- Non-optimized index
- Fragmented index
- Overly broad query
Solutions:
- Run optimization
- Reduce number of indexed fields
- Refine user queries
IFilters
What is an IFilter?
An IFilter is a Windows component that extracts text from files of a specific format so that their content can be indexed.
Required IFilters:
- PDF: Adobe PDF IFilter or equivalent
- Office: Included in Office (docx, xlsx, pptx)
- Legacy Office: IFilter for doc, xls, ppt
- Others: As needed (CAD, images with OCR, etc.)
Installation: See dedicated article on IFilters
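
To check quickly which IFilter family an attachment depends on, the list above can be turned into a lookup by file extension. This is a sketch, not a product feature: the mapping `IFILTER_BY_EXTENSION` and the function `required_ifilter` are hypothetical, and the descriptions simply repeat the list above.

```python
import os

# Extension -> IFilter requirement, transcribed from the list above.
IFILTER_BY_EXTENSION = {
    ".pdf": "Adobe PDF IFilter or equivalent",
    ".docx": "Included in Office",
    ".xlsx": "Included in Office",
    ".pptx": "Included in Office",
    ".doc": "IFilter for legacy Office formats",
    ".xls": "IFilter for legacy Office formats",
    ".ppt": "IFilter for legacy Office formats",
}


def required_ifilter(filename: str) -> str:
    """Return the IFilter requirement for a file, based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    return IFILTER_BY_EXTENSION.get(extension, "Other: install as needed")
```

The lookup only names the requirement; whether the IFilter is actually installed still has to be verified on the server.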
Verification
Test extraction:
- Create a test document with an attachment
- Force indexing
- Explore the document index
- Verify attachment content is present
If failure:
- IFilter not installed
- Incompatible IFilter
- Corrupted file
Support
For advanced configurations (distributed index, replication, performance on very large volumes), contact Avanteam support.