Full Text TXT Search In Delphi (+FireDac +SQLite). Sample Working Project.

  

Finally, the sample project for my full text search implementation in Delphi is ready. The idea involves providing a user of my application the ability to do full text (Google like) searches on a set of documents by searching inside document content. For the task, I’ve selected SQLite for the database and FireDac for the database connectivity and operations.

For the sake of simplicity and the proof-of-concept approach this sample program is really simple: it will index all TXT files under some directory (including sub directories) and allow to do simple “match” type full text search queries. Certainly, in my real world application, the task is more complex and involves non-TXT files, some Windows service application and alike … but for the start, if you are up to implementing full text search, this sample project should give you a kick-start.
Download Sample Project Source Code. Download Executable Only.
The database selected is SQLite. SQLite is a self-contained, serverless, zero-configuration, transactional SQL database engine, and has the FTS engine built inside.
To connect to and operate on the database I’ve picked FireDac. FireDac knows how to talk to SQlite (any many other databases) and has proven to be robust enough when working with SQLite.
Here’s how this sample program works:

The user selects a folder containing TXT files (in any subfolders)
If the full text index database has not been created (i.e. the .SDB file does not exist)

the database is created
TXT files and their content are imported into database

Display the files in a tree (Virtual Tree View)
Each file / folder node has a check box to allow selection
FTS queries can be done against the range of selected files

The Database: Multi User Read-Write SQLite Connection
SQLite is primarily used in single connection access to the database scenarios. In my scenario I have to allow multiple access (from various application and application instances) to open, read and write to the database. The TADConnection (+TADPhysSQLiteDriverLink) component is used to connect and talk to the database. By default, SQLite driver settings are set for single connection, to allow multiple connections one has to alter some parameters. Here’s how my connection parameters look:

DriverID=SQLite
Database= …
OpenMode=CreateUTF16
Password=aes-256:12345
LockingMode=Normal
JournalMode=WAL
Synchronous=Full
SQLiteAdvanced=temp_store=MEMORY
SQLiteAdvanced=page_size=4096
SQLiteAdvanced=auto_vacuum=FULL

SQLite database will be created at a connection establishment if it does not exist. I would suggest to read above the above parameters used in FireDac help.
After the connection has been established and the database has been created with tables to support FTS queries (+ referential integrity), we fill in the database:
The Data: TXT files and Their Content
A simple TDirectory.GetFiles is used to get all the TXT files in a selected directory.

var
subFiles : TStringDynArray;
subFile : string;
begin
subFiles := TDirectory.GetFiles(
main.MainForm.RootDirectory,
TSearchOption.soAllDirectories,
function(const Path: string; const SearchRec: TSearchRec): Boolean
begin
result := SameText(ExtractFileExt(SearchRec.Name), ‘.txt’);
end);

for subFile in subFiles do
UpdateFileInDB(subFile);
end;

Files are imported into the database one by one:

function TDM.UpdateFileInDB(const fileName: string; const removeOld : boolean = false): boolean;
var
sStream: TStringStream;
documentRowID : integer;
begin
result := true;

if removeOld then
sqliteConnection.ExecSQL(‘DELETE FROM Document WHERE FileName = ‘ + QuotedStr(fileName));

sStream := TStringStream.Create;
try
sStream.LoadFromFile(fileName);

try
MainForm.LogInMemo(‘ +’ + fileName);

sqliteConnection.ExecSQL(‘INSERT INTO Document VALUES (NULL, :fileName)’, [fileName]);

documentRowID := sqliteConnection.ExecSQLScalar(‘SELECT LAST_INSERT_ROWID() AS rowid’);

//1 page per document for the sake of simplicity
sqliteConnection.ExecSQL(‘INSERT INTO FTSData VALUES (NULL, :id, :page, :txt)’, [documentRowID, 1, sStream.DataString]);
except on E: Exception do
begin
result := false;
MessageDlg(‘Error writing to database:’ + E.Message, mtError, [mbOk], -1);
end;
end;
finally
sStream.Free;
end;
end;

Now, all the files have been “indexed” and they are presented in the tree view control. I’ve used Virtual TreeView.
The FTS Search
Finally, specify what files to include in the FTS search, specify your token (word) you are looking for and hit “Search”:

procedure TDM.RetrieveSearchResults(const searchToken: string; const inFiles : TStringList = nil);
var
fn : string;
st : integer;
begin
//get search results
sqlQuery.SQL.Clear;

//temp table for selected files
sqlQuery.SQL.Add(‘CREATE TEMPORARY TABLE IF NOT EXISTS PartOfDocument (FileName STRING);’);
sqlQuery.SQL.Add(‘DELETE FROM PartOfDocument;’);

if Assigned(inFiles) then
for fn in inFiles do
sqlQuery.SQL.Add(‘INSERT INTO PartOfDocument VALUES (‘ + QuotedStr(fn) + ‘);’);

//main fts query
sqlQuery.SQL.Add(‘SELECT Document.Id, Document.FileName FROM PageContent’);
sqlQuery.SQL.Add(‘INNER JOIN FilePages ON FilePages.rowid = PageContent.rowid’);
sqlQuery.SQL.Add(‘INNER JOIN Document ON Document.rowid = FilePages.DocumentID’);
if Assigned(inFiles) AND (inFiles.Count > 0) then
sqlQuery.SQL.Add(‘INNER JOIN PartOfDocument ON PartOfDocument.FileName = Document.FileName’);
sqlQuery.SQL.Add(‘WHERE PageContent MATCH ‘ + QuotedStr(searchToken));
sqlQuery.SQL.Add(‘;’);

sqlQuery.OpenOrExecute;

MainForm.SearchResults.Clear;
while NOT sqlQuery.Eof do
begin
MainForm.SearchResults.Add(
TSearchResult.Create(
sqlQuery.Fields[0].AsInteger, //doc.id
sqlQuery.Fields[1].AsString //file name
));
sqlQuery.Next;
end;

sqlQuery.Close;
end;

A temporary table is created (if it does not exist), filled in with selected file names. The actual search query is a JOIN between different tables but the most important part is the MATCH on the FTS table. Results are displayed in a list view.
That’s it. Questions? Comments? More than welcome!

Part 1: Full Text Search Functionality – Implementation Idea
Part 2: SQLite: Referential Integrity With Full Text Search Virtual Tables

Comments are closed.