Quentier local storage implementation overview

Series: Developers guide

This post continues the exploration of Quentier’s internals: here we’ll talk about the implementation details of Quentier’s local storage.

As I mentioned in the brief technical overview post, the local storage within libquentier uses SQLite as its backend i.e. all the data downloaded from Evernote (or user’s own data in case of local account) is stored within a SQLite database. However, it is just an implementation detail: the fact of SQLite usage is completely opaque to any clients of local storage functionality within libquentier. So in theory is is possible to change the database backend in future to whatever one is considered better - even to storage of plain files within some folder. Internally SQLite is used through Qt’s QtSql module.

Libquentier provides both synchronous and asynchronous versions of local storage management - which correspond to LocalStorageManager and LocalStorageManagerAsync classes within the public libquentier API. The asynchronous interface offers signals and slots for asynchronous communication with the local storage for clients. In Quentier local storage operations run within a separate dedicated thread of execution so the communication with the outside world via signals and slots is truly asynchronous. As such communication is asynchronous, each signal and slot involved has a special parameter of type QUuid which is called requestId in most cases. This identifier allows the clients of local storage to identify which particular signal from the local storage is the response to their query i.e. to the invokation of some LocalStorageManagerAsync’s slot - the response signal would have the same request identifier as the one passed to the slot.

LocalStorageManagerAsync maintains the in-memory cache of several most recently added/updated/looked up data tems following the assumption that these items would be quickly accessed again. At the time of this writing there’s no collection of cache hit/miss statistics although it would be nice to implement it in order to see how useful that in-memory cache really is. Note that this cache is not the same as the in-memory cache of data maintained by SQLite itself - SQLite’s in-memory cache is completely opaque to the code requesting stuff from the database; LocalStorageManagerAsync’s cache is only opaque to its own clients. Libquentier’s API makes it possible to disable LocalStorageManagerAsync’s cache by calling setUseCache(false) on LocalStorageManagerAsync instance. Libquentier also allows one to customize the cache expiration function by passing an object of a custom subclass of ILocalStorageCacheExpiryChecker to LocalStorageManagerAsync’s installCacheExpiryFunction method. The methods of this class evaluate when it’s time to drop the least recently used data items from the cache.

More involved interface for working with the cache is accessible with the help of LocalStorageCacheManager class the const pointer to which can be retrieved from LocalStorageManagerAsync via localStorageCacheManager() method. The returned pointer would be nullptr if the cache was disabled.

LocalStorageManagerAsync acts as a wrapper around synchronous LocalStorageManager class so its role is to implement the asynchronous interface around the synchronous one and also to maintain the cache of most recently used data items.

LocalStorageManager’s interface is pretty self-explanatory and also contains quite complete Doxygen documentation: it offers the ability to add/update/search/remove/list (with limit, offset and sorting order) notes/notebooks/tags/saved searches/resources (attachments to notes).

Now some info about the private implementation of local storage management.

First, if you not familiar with pimpl idiom, you can read about it here. In C++/Qt/KDE the pattern is also known as d-pointer. Pimpl idiom is the way in which private local storage management implementation is hidden from the clients using the API. Is it a pretty standard pattern for shared libraries intended to prevent the breakage of ABI due to internal implementation changes.

In context of local storage management public LocalStorageManager class has a single pointer to its opaque implementation class: LocalStorageManagerPrivate:

QT_FORWARD_DECLARE_CLASS(LocalStorageManagerPrivate)

class QUENTIER_EXPORT LocalStorageManager: public QObject
{
    Q_OBJECT
public:
    <...>

private:
    QScopedPointer<LocalStorageManagerPrivate>  d_ptr;
    Q_DECLARE_PRIVATE(LocalStorageManager)
};

This private class is declared in src/local_storage/LocalStorageManager_p.h (and the implementation is within .cpp file with the same base name). The same approach is used for the implementation of LocalStorageCacheManager, it is declared in src/local_storage/LocalStorageCacheManager_p.h. The actual implementation of the cache is not very simple: it uses multi-index container from boost library and one of indices is the last access timestamp. It might be a little overkill compared to some simplistic LRU cache based on a map and a linked list but several other indices are used as well so it won’t be just a single key map anyway.

The contents of LocalStorageManagerPrivate class (the one declared in src/local_storage/LocalStorageManager_p.h) are rougly as follows: * public methods repeating those specified in the public interface of LocalStorageManager class - these are the methods to which the control is handed over by the public class’ methods * private methods * private class members

Let’s start from one single private method which provides the most important information about how the local storage database is organized: createTables. This method is called every time the account is switched as well as when the local storage manager is created for some particular account. However, this method only really does real work just once, when the account is created for the first time because in this case the tables within SQLite database are actually being created. All the next times their creation is simply skipped. Here’s a simple example of what happens within createTables method:

res = query.exec(QStringLiteral("CREATE TABLE IF NOT EXISTS Users("
                                "  id                              INTEGER PRIMARY KEY     NOT NULL UNIQUE, "
                                "  username                        TEXT                    DEFAULT NULL, "
                                "  email                           TEXT                    DEFAULT NULL, "
                                "  name                            TEXT                    DEFAULT NULL, "
                                <... more rows here ...>
                                ")"));
errorPrefix.setBase(QT_TR_NOOP("Can't create Users table"));
DATABASE_CHECK_AND_SET_ERROR();

Each table has some column serving as the primary key. For most tables such column is called localUid. It is the complement for Evernote’s guid - globally unique identifier. Guids are assigned to data items by Evernote servers, not by the client apps. So if Quentier was to rely on guids as primary keys, it would have to contant Evernote servers each time a new note or notebook or tag or saved search or note attachment is created. Doing so would effectively kill the ability to work with the notes within Evernote account offline. So the local storage uses the unique key of its own - localUid or local unique identifier. This key is not sent to Evernote during the synchronization so only libquentier’s client apps know about its existence. Local uid is primarily used to support data items both already synchronized with Evernote and not yet synchronized. In local accounts data items are not synchronized with Evernote at all so they would need some kind of unique identifier anyway.

Some tables have columns referencing other tables’ columns - those are subjects to foreign key constraints meant to help maintain the integrity of the database and reject attempts to do something which, if performed, would leave the database in some inconsistent state. One example of such a constraint is as follows:

res = query.exec(QStringLiteral("CREATE TABLE IF NOT EXISTS Notes("
                                "  localUid                        TEXT PRIMARY KEY     NOT NULL UNIQUE, "
                                "  guid                            TEXT                 DEFAULT NULL UNIQUE, "
                                "  updateSequenceNumber            INTEGER              DEFAULT NULL, "
                                <... more rows here ...>
                                "  notebookLocalUid REFERENCES Notebooks(localUid) ON UPDATE CASCADE, "
                                "  notebookGuid REFERENCES Notebooks(guid) ON UPDATE CASCADE, "
                                <... more rows here ...>
                                "  UNIQUE(localUid, guid)"
                                ")"));
errorPrefix.setBase(QT_TR_NOOP("Can't create Notes table"));
DATABASE_CHECK_AND_SET_ERROR();

The foreign key constraints for notes mean that they should only contain notebook local uid and/or guid which are present within the notebooks table. Otherwise it won’t make sense - note says it is from a notebook which doesn’t exist! The database which allowed that to happen would have to be considered corrupted. Possibly it could be repaired by manually creating such a notebook but it’s not something the user of the app would be happy to do so it’s better to prevent the possibility of such corruption altogether.

Notice that the specification on notebook local uid and guid columns is ON UPDATE CASCADE and not ON UPDATE CASCADE ON DELETE CASCADE. The latter one would mean that the removal of a notebook should lead to the removal of all its notes but unfortunately this ON DELETE CASCADE interferes with specific kind of updates within notebooks table - the updates of INSERT OR REPLACE form. Apparently SQLite thinks that REPLACE corresponds to DELETE followed by INSERT. So on the update of a notebook it would delete the notebook and all the notes inside it and then insert the notebook back. But not the deleted notes.

In order to prevent that, another way to maintain the consistency on notebook removal has been implemented:

res = query.exec(QStringLiteral("CREATE TRIGGER IF NOT EXISTS on_notebook_delete_trigger BEFORE DELETE ON Notebooks "
                                "BEGIN "
                                "DELETE FROM NotebookRestrictions WHERE NotebookRestrictions.localUid=OLD.localUid; "
                                "DELETE FROM SharedNotebooks WHERE SharedNotebooks.sharedNotebookNotebookGuid=OLD.guid; "
                                "DELETE FROM Notes WHERE Notes.notebookLocalUid=OLD.localUid; "
                                "END"));
errorPrefix.setBase(QT_TR_NOOP("Can't create trigger to fire on notebook deletion"));
DATABASE_CHECK_AND_SET_ERROR();

This trigger would be executed before the actual permanent removal of a notebook and would ensure the related tables stay consistent by removing those of their rows which reference the about to be deleted notebook.

In addition to “usual” tables there are also some virtual tables for efficient searching of stuff within the database. For example:

res = query.exec(QStringLiteral("CREATE VIRTUAL TABLE IF NOT EXISTS NotebookFTS USING FTS4(content=\"Notebooks\", "
                                "localUid, guid, notebookName)"));
errorPrefix.setBase(QT_TR_NOOP("Can't create virtual FTS4 NotebookFTS table"));
DATABASE_CHECK_AND_SET_ERROR();

FTS4 is one of SQLite’s implementations of full text search. Although it’s not of much importance in 2018, some Linux distros used to ship Qt4 versions with SQLite containing only FTS3 implementations - such versions of Qt4 won’t be suitable for libquentier - it would compile but tests won’t pass and in general things won’t work. So if you are building libquentier with Qt4, it is strongly recommended to run libquentier’s tests to ensure they pass and hence Qt’s SQLite supports FTS4.

Other private methods of LocalStorageManagerPrivate class serve as implementation pieces of public methods. These private methods can be roughly categorized as follows: * methods running INSERT OR REPLACE queries which are used by methods adding items to the local storage as well as by methods updating items within the local storage * methods running SELECT queries which are used by methods listing or searching items within the local storage * methods preparing INSERT OR REPLACE queries - more on this in a moment * other auxiliary methods

Preparing a SQL query roughly means converting it from a human readable string into SQLite byte code. For certain types of queries it is easy to do the preparation step once and then run the prepared query over and over again which results in removing the duplicate work each time the query needs to run. For this reason INSERT OR REPLACE queries are prepared once and then run several times until the account is switched or until LocalStorageManagerPrivate class is destroyed (when Quentier is shut down, for example). The data members of LocalStorageManagerPrivate class corresponding to these queries look like this:

QSqlQuery           m_insertOrReplaceSavedSearchQuery;
bool                m_insertOrReplaceSavedSearchQueryPrepared;

The bool serves for checking whether the query has already been prepared or not.

That wraps up the introduction to the local storage management internals of Quentier. Due to the fact the code performing this management lives within libquentier library, other applications can also make use of this code which, together with libquentier’s synchronization abilities, can enable other apps to easily perform full and incremental sync of Evernote account data to the local database.


This is a post in the Developers guide series.
Other posts in this series:

w