Offline Mode for Web Application

The next step in improving the Chronic Reader app will consist of making it work offline. This is necessary for a reliable book reading app, because you will want to keep reading on the tube or during your flight, when your internet connection stops working.

To achieve this we will use a service worker, which gets installed in the user's browser and acts as a proxy between the application UI and the server. This approach lets us introduce offline functionality with very few changes to the UI. Our service worker will intercept all calls the UI makes to the server and add its own processing. It will save the resources required by the UI in two places: an in-browser cache and an in-browser database. Each store is used for a different type of resource, with the resources saved to the database meant to be stored indefinitely by the browser.

Installing the service worker

The first step when adding a service worker is to create a JavaScript file to hold all its code. I have added a new file to my server resources named serviceworker.js.

window.onload = function() {
    if('serviceWorker' in navigator) {
        navigator.serviceWorker.register('/serviceworker.js')
            .then(function(registration) {
                registration.update()
            }, function(error) {
                console.log("service worker registration failed: ", error)
            })
    }
}

The next step is to install that service worker. We do this when one of our application pages loads and runs some JavaScript code, shown above, telling the browser we want to register a script as a service worker. Not all browsers support this (older ones don't), so we should check that the feature is available before we use it.

Initializing the service worker

On the service worker side, everything is handled through events. Some events are triggered by the browser as the worker moves through its lifecycle; others are triggered when JavaScript on the UI sends a request or a message. The first event that will be triggered is the install event.

self.addEventListener('install', e => {
    e.waitUntil(initCache())
    e.waitUntil(
        caches.keys().then(function(cacheNames) {
            return Promise.all(
                cacheNames.filter(function(cacheName) {
                    return cacheName != CACHE_NAME
                }).map(function(cacheName) {
                    return caches.delete(cacheName)
                })
            )
        })
    )
})

Our service worker will initialize the cache when detecting the install event.

var filesToCache = [
    '/book.css',
    '/book.js',
    '/bookNode.js',
    ...
]

function initCache() {
    return caches.open(CACHE_NAME).then(cache => {
        // return the promise so the install step waits until all files are cached
        return cache.addAll(filesToCache)
    })
}

Cache initialization will consist of downloading and storing resource files (css, js, images and fonts) to the browser cache. These files will then be available when the browser does not have access to the backend server.

self.addEventListener('activate', e => {
    // keep the activate event alive until the worker has claimed the open pages
    e.waitUntil(self.clients.claim())
})

The second event usually triggered when installing a service worker is the activate event. When this happens, we call clients.claim() so that the service worker immediately takes control of all open tabs of our application and starts receiving the events that originate there, without waiting for a page reload.

Initializing the database

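The browser database (IndexedDB) also has to be initialized before we can use it. The initialization step creates all necessary tables (called object stores in IndexedDB), and we need to define the key name for each table when we do this. We can also add indexes, which let us look up entries by a field other than the key and, when declared unique, restrict the data we can push into the database.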

var db

function getDb() {
    return new Promise((resolve, reject) => {
        if (! db) {
            const request = indexedDB.open(DATABASE_NAME, DATABASE_VERSION)
            request.onerror = function(event) {
                reject(request.error)
            }
            request.onsuccess = function(event) {
                db = event.target.result
                resolve(event.target.result)
            }
            request.onupgradeneeded = function(event) {
                let localDb = event.target.result
                var requestsStore = localDb.createObjectStore(REQUESTS_TABLE, {
                    keyPath: 'url'
                })
                requestsStore.createIndex(ID_INDEX, ID_INDEX, { unique: false })
                var progressStore = localDb.createObjectStore(PROGRESS_TABLE, {
                    keyPath: 'id'
                })
                var booksStore = localDb.createObjectStore(BOOKS_TABLE, {
                    keyPath: 'id'
                })
                var workerStore = localDb.createObjectStore(WORKER_TABLE, {
                    keyPath: 'id'
                })
            }
        } else {
            resolve(db)
        }
    })
}

function deleteDb() {
    return new Promise((resolve, reject) => {
        var req = indexedDB.deleteDatabase(DATABASE_NAME)
        req.onsuccess = function () {
            console.log("Deleted database successfully")
            resolve()
        }
        req.onerror = function () {
            console.log("Couldn't delete database")
            reject()
        }
        req.onblocked = function () {
            console.log("Couldn't delete database - operation blocked")
            reject()
        }
    })
}

getDb()

This approach keeps database loading and initialization inside the accessor method. The browser can shut down an idle service worker and start it again later, so after the application has not been used for a while the db variable may lose its value. This is why we should always access the database through the getDb() method, which reinitializes the db variable if necessary. We also have functionality for deleting the whole database, which can be used when resetting the application. In this delete method you can see that we wrap the operation in a Promise. Most of the code in the service worker is based on promises, because we want most of these operations to execute asynchronously.
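One detail worth noting about lazy accessors of this kind: if two callers arrive before the first open request has finished, both may start opening the resource. A common way to avoid this is to cache the pending promise rather than the resolved value. The sketch below shows the idea in isolation. The names lazyAccessor and openResource are hypothetical, and openResource stands in for the indexedDB.open logic so that the snippet can run outside a browser.

```javascript
// Hypothetical helper: wraps an async "open" function so that concurrent
// callers share a single open attempt instead of each starting their own.
function lazyAccessor(openResource) {
    let pending // cached promise; lost (and rebuilt) if the worker restarts
    return function get() {
        if (!pending) {
            pending = openResource().catch(error => {
                pending = undefined // allow a retry after a failed open
                throw error
            })
        }
        return pending
    }
}

// Usage with a stand-in for the real database open call
let openCount = 0
const getResource = lazyAccessor(async () => {
    openCount += 1
    return { name: 'fake-db' }
})
```

Calling getResource() several times in a row returns the same promise, so the stand-in open function runs only once.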

Intercepting communication

The service worker interacts with the UI by intercepting messages and events sent from the UI. The main event we are interested in is the fetch event, which is triggered every time the UI sends a request to a server.

self.addEventListener('fetch', e => {
    var url = new URL(e.request.url)

    if (url.pathname === '/markProgress') {
        e.respondWith(handleMarkProgress(e.request))
    } else if (url.pathname === '/loadProgress') {
        e.respondWith(handleLoadProgress(e.request))
    } else if (url.pathname === '/latestRead') {
        e.respondWith(handleLatestReadRequest(e.request))
    } else if (url.pathname === '/imageData' 
        || url.pathname === '/comic' 
        || url.pathname === '/bookResource' 
        || url.pathname === '/book') {
        e.respondWith(handleDataRequest(e.request))
    } else if (url.pathname === '/bookSection') {
        e.respondWith(handleBookSectionRequest(e.request))
    } else if (url.pathname === '/') {
        e.respondWith(handleRootRequest(e.request))
    } else if (url.pathname === '/search') {
        e.respondWith(handleSearchRequest(e.request))
    } else if ((url.pathname === '/login' && e.request.method == 'POST') 
        || (url.pathname === '/logout')) {
        e.respondWith(handleLoginLogout(e.request))
    } else if (filesToCache.includes(url.pathname)) {
        e.respondWith(handleWebResourceRequest(e.request))
    } else {
        e.respondWith(fetch(e.request))
    }
})

We handle different requests to our server in different ways. The main distinction is between database-first and server-first behavior. With database-first requests, we try to load data from the browser database or cache and provide that data to the UI; only if the data is not available there do we send the request to the server. Server-first requests are handled the opposite way: we first try to load data from the server, and only if this fails do we fall back on the data we may have in the browser database or cache.

We use the database-first approach for large book resource data. If the contents of a book or comic book are already in the database, the book is present on the device, so we don't need to send that call to the server and waste time; we can return the response from the database. This approach enables the offline behavior of the application, but it also functions as an on-device cache. Even if you have an internet connection, the book is served from the device, so pages load faster and the reading experience is better, with reduced to non-existent loading times.

The server-first approach is used with application resources: style sheets, javascript files. These resources may change as we update the backend. We want to always get the latest version of these resources from the server, if possible. These resources will only be loaded from the browser cache when there is no internet connection, to keep the application running.
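Stripped of the service worker specifics, the two strategies can be sketched as a pair of small helpers. This is only an illustration: databaseFirst, serverFirst, loadLocal and loadRemote are hypothetical names, and the two loader functions stand in for a database or cache lookup and a fetch call respectively.

```javascript
// Hypothetical helpers illustrating the two strategies. loadLocal and
// loadRemote are any async functions supplied by the caller.

// Database-first: use local data when present, otherwise ask the server.
async function databaseFirst(loadLocal, loadRemote) {
    const local = await loadLocal()
    if (local !== undefined) {
        return local
    }
    return loadRemote()
}

// Server-first: ask the server, fall back to local data if the request fails.
async function serverFirst(loadLocal, loadRemote) {
    try {
        return await loadRemote()
    } catch (error) {
        return loadLocal()
    }
}
```

Note that in the database-first case the server is not contacted at all when local data exists, which is what makes it double as an on-device cache.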

async function handleDataRequest(request) {
    let databaseResponse = await databaseLoad(REQUESTS_TABLE, request.url)

    if (databaseResponse) {
        return databaseEntityToResponse(databaseResponse)
    } else {
        return fetch(request)
    }
}

As an example of the database-first approach, the handleDataRequest method tries to load a book resource for the UI. First we try to load the response from the REQUESTS_TABLE in the database, using request.url as the key. If a response is available, we just return it. If not, we try to load it from the server by calling the fetch method.

async function handleLoadProgress(request) {
    await syncProgressInDatabase()

    let url = new URL(request.url)
    let id = parseInt(url.searchParams.get("id"))

    let serverProgress
    try {
        serverProgress = await fetch(request)
    } catch (error) {
        serverProgress = undefined
    }

    if (serverProgress) {
        return serverProgress
    } else {
        let databaseProgress = await databaseLoad(PROGRESS_TABLE, id)
        let databaseResponse = new Response(databaseProgress.position, {
            headers: {'Content-Type': 'application/json'}
        })
        return databaseResponse
    }
}

A more complex scenario is necessary for handling progress save and load. Progress is the user's position in a book. This position gets saved every time a user flips a page. The latest position needs to be saved on the server, so that the user's position in a book is synchronized across multiple devices. But when a user is reading a book in offline mode, the position update can't make it to the server. So we save that position in our device database. But once internet connection is available again, we must send the latest position to the server. All this logic is handled in the handleLoadProgress method. The first thing we do is try to sync on-device progress with the server. Then we try to load the latest progress from the server. If this fails, we load the latest progress from the browser database.
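The syncProgressInDatabase function is not shown in this article. As a rough illustration of the idea only, the sketch below tries to deliver a list of locally saved positions and returns the ones that could not be sent, so they can be retried later. The names syncPending, entries and send are hypothetical, and send stands in for whatever request delivers one position to the server.

```javascript
// Hypothetical sketch: try to send every locally saved position to the server.
// Entries that cannot be delivered are returned so they can be retried later.
async function syncPending(entries, send) {
    const stillPending = []
    for (const entry of entries) {
        try {
            await send(entry)
        } catch (error) {
            // offline or server error: keep the entry for the next attempt
            stillPending.push(entry)
        }
    }
    return stillPending
}
```

Sending the entries one by one keeps the logic simple and means a single failed request does not discard the remaining positions.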

Custom messages

Our service worker can also listen for custom messages sent by the UI through the message event. In our application we use these for special functionality: requesting that a book be stored on or deleted from the device, and resetting the application.

self.addEventListener('message', event => {
    if (event.data.type === 'storeBook') {
        var id = parseInt(event.data.bookId)
        var size = parseInt(event.data.maxPositions)
        triggerStoreBook(id, event.data.kind, size)
    } else if (event.data.type === 'deleteBook') {
        deleteBookFromDatabase(event.data.bookId)
    } else if (event.data.type === 'reset') {
        resetApplication()
    }
})

async function resetApplication() {
    // delete all data from cache
    await caches.delete(CACHE_NAME)

    // delete all data from database
    await databaseDeleteAll(REQUESTS_TABLE)
    await databaseDeleteAll(BOOKS_TABLE)
    await databaseDeleteAll(PROGRESS_TABLE)
    await databaseDeleteAll(WORKER_TABLE)
    //await deleteDb()

    // unregister service worker
    await self.registration.unregister()
}

Book storing will be discussed in the next article. To reset the application, we delete all cache contents, clear every table in the database, then unregister the service worker. This is always done on logout, because we don't want to keep book and user data in the browser if the user is no longer logged in. The reset can also be requested by the user from the UI, to fix issues that may arise when the application is used.
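On the UI side, these messages are sent with postMessage on the controlling service worker. The helper below is a minimal sketch and not the application's actual code: sendToWorker is a hypothetical name, and the container is passed in as a parameter (in a browser it would be navigator.serviceWorker) so the function can also be tried with a stand-in object. The field values in the example message are made up.

```javascript
// Hypothetical UI-side helper. `container` is expected to be
// navigator.serviceWorker in a browser; it is a parameter so the function
// can also be exercised with a stand-in object.
function sendToWorker(container, message) {
    if (!container || !container.controller) {
        return false // no active service worker controls this page yet
    }
    container.controller.postMessage(message)
    return true
}

// Message shape matching the 'storeBook' branch of the message listener.
// The values here are invented for illustration.
const storeBookMessage = {
    type: 'storeBook',
    bookId: 42,
    kind: 'book',
    maxPositions: 1200
}
```

In a page this would be called as sendToWorker(navigator.serviceWorker, storeBookMessage). The check on controller matters because a page loaded before the worker took control has no controller to post to.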

Database utility methods

I will also add here a set of database utility methods, which can be used with all tables to perform some common database operations we need in our application.

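function databaseSave(table, value) {
    return new Promise((resolve, reject) => {
        getDb().then(db => {
            let transaction = db.transaction([table], "readwrite")
            // resolve only once the whole transaction has been committed
            transaction.oncomplete = function(event) {
                resolve(value)
            }
            // report failures instead of leaving the promise pending forever
            transaction.onerror = function(event) {
                reject(transaction.error)
            }
            let objectStore = transaction.objectStore(table)
            value['date'] = new Date()
            objectStore.put(value)
        }).catch(reject)
    })
}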

We save data to the database in a transaction. We always add a date field to the object, recording the last time it was saved. If an object with that key already exists in the database, it will be overwritten, and the date will reflect this.

function databaseLoad(table, key) {
    return new Promise((resolve, reject) => {
        getDb().then(db => {
            let transaction = db.transaction([table])
            let objectStore = transaction.objectStore(table)
            let dbRequest = objectStore.get(key)
            dbRequest.onsuccess = function(event) {
                resolve(event.target.result)
            }
        })
    })
}

Loading from the database by key is a very simple operation, and the result should be a single object.

function databaseDeleteAll(table) {
    return new Promise((resolve, reject) => {
        getDb().then(db => {
            let transaction = db.transaction([table], "readwrite")
            let objectStore = transaction.objectStore(table)
            let deleteRequest = objectStore.clear()
            deleteRequest.onsuccess = event => {
                resolve()
            }
        })
    })
}

Deleting all objects in a table is also a simple operation achieved by calling the clear method on the object store.

function databaseDelete(matchFunction, table, indexName = undefined, 
                        indexValue = undefined) {
    return new Promise((resolve, reject) => {
        getDb().then(db => {
            let transaction = db.transaction([table], "readwrite")
            let objectStore = transaction.objectStore(table)

            let cursorRequest
            if (indexName) {
                let index = objectStore.index(indexName)
                cursorRequest = index.openCursor(IDBKeyRange.only(indexValue))
            } else {
                cursorRequest = objectStore.openCursor()
            }

            let deletedCount = 0
            cursorRequest.onsuccess = event => {
                let cursor = event.target.result
                if (cursor) {
                    if (matchFunction(cursor.value)) {
                        objectStore.delete(cursor.primaryKey)
                        deletedCount += 1
                    }
                    cursor.continue()
                } else {
                    resolve(deletedCount)
                }
            }
        })
    })
}

The databaseDelete method introduces more complex functionality. We usually delete information from the database in bulk: all data for a book gets deleted at once when the book is no longer part of the latest read section. To do this, we must match all objects that have the correct book ID and delete them. We do this by opening a cursor on the target table, then applying the provided matchFunction to every entry in the table and deleting that entry if the matchFunction returns true. This is further optimized if we provide an indexName and an indexValue parameter. In this case, the cursor is opened on that index, and only database entries matching the indexValue are traversed and processed with the matchFunction. However, at the time of writing, this optimization does not work on the Safari browser on iOS because of a bug related to accessing database entries through the cursor.
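To make the role of the matchFunction concrete, here is a small in-memory sketch of the same traversal. It is not IndexedDB code: a Map stands in for the object store, and the names matchesBook, deleteMatching and the bookId field are assumptions made for this illustration.

```javascript
// Hypothetical match function factory. The field name bookId is an
// assumption; the real entries may use a different property.
function matchesBook(bookId) {
    return entry => entry.bookId === bookId
}

// In-memory stand-in for the cursor traversal: visit every entry, delete
// the ones accepted by matchFunction, and report how many were removed.
function deleteMatching(store, matchFunction) {
    let deletedCount = 0
    for (const [key, value] of store) {
        if (matchFunction(value)) {
            store.delete(key) // deleting during Map iteration is safe in JavaScript
            deletedCount += 1
        }
    }
    return deletedCount
}

// Example data keyed by URL, mirroring the 'url' key path of the requests table
const store = new Map([
    ['/bookSection?id=1&position=0', { bookId: 1 }],
    ['/bookSection?id=1&position=50', { bookId: 1 }],
    ['/bookSection?id=2&position=0', { bookId: 2 }]
])
```

With the real utility, the equivalent call would pass a match function of this kind as the first argument to databaseDelete, optionally together with an index name and value to narrow the traversal.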

function databaseFindFirst(matchFunction, table, indexName = undefined,
                           indexValue = undefined) {
    return new Promise((resolve, reject) => {
        getDb().then(db => {
            let transaction = db.transaction(table)
            let objectStore = transaction.objectStore(table)
            let cursorRequest
            if (indexName) {
                let index = objectStore.index(indexName)
                cursorRequest = index.openCursor(IDBKeyRange.only(indexValue))
            } else {
                cursorRequest = objectStore.openCursor()
            }
            cursorRequest.onsuccess = event => {
                let cursor = event.target.result
                if (cursor) {
                    if (matchFunction(cursor.value)) {
                        resolve(cursor.value)
                    } else {
                        cursor.continue()
                    }
                } else {
                    resolve()
                }
            }
        })
    })
}

The databaseFindFirst method will similarly use a matchFunction to identify the desired database entry, but it will return the first value that matches that function.

function databaseLoadDistinct(table, column) {
    return new Promise((resolve, reject) => {
        getDb().then(db => {
            let transaction = db.transaction([table])
            let objectStore = transaction.objectStore(table)
            let cursorRequest = objectStore.openCursor()
            let distinctValues = new Set()
            cursorRequest.onsuccess = event => {
                let cursor = event.target.result
                if (cursor) {
                    distinctValues.add(cursor.value[column])
                    cursor.continue()
                } else {
                    resolve(distinctValues)
                }
            }
            cursorRequest.onerror = event => reject()
        })
    })
}

We also need a databaseLoadDistinct function to find out which books we currently have in the database. For this we again open a cursor, go over all entries in the table, and collect all values of the desired column in a Set object. We then return this set, which contains every distinct value exactly once.

Other considerations

As already mentioned, we must ensure that one user's data is not available to another user. We must also make sure that if a user logs out, their data does not remain on the device. To do this, we reset the service worker on every logout (and also when a login is detected), which deletes all the data stored in the browser for our application. The data relevant to the current user is downloaded to the device again after login. This reset can also be triggered by the user if problems arise in how the application is functioning.

Downloading books to the device will require some additional engineering. Books and comic books have different kinds of resources, so their download processes will be different. Every time the latest read books are loaded, we need to check and make sure all of them are available on the device, and for those that are not we must download them. And when downloading books to the device, we must throttle this process and make sure we don't try to load all pages for all books from the server at once. Ideally our client should only download one resource at a time, and for this we need a method to control how the download requests are handled by the service worker even when multiple UI clients may request different book downloads at once from the service worker. All these aspects will be discussed in the next article about the Chronic Reader app.
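The actual download logic is left for the next article, but the "one resource at a time" requirement can be illustrated with a generic serial queue. The sketch below is an assumption about one possible approach, not the implementation used in Chronic Reader; createSerialQueue and enqueue are hypothetical names.

```javascript
// Hypothetical sketch of a serial task queue: each task starts only after the
// previous one has settled, so at most one download runs at a time.
function createSerialQueue() {
    let tail = Promise.resolve()
    return function enqueue(task) {
        // chain the new task behind whatever is currently last in line
        const result = tail.then(() => task())
        // a failed task must not block the tasks queued after it
        tail = result.catch(() => {})
        return result
    }
}
```

Each call to enqueue takes a function returning a promise and returns a promise for that task's result, so callers from different tabs could share one queue while still awaiting their own downloads.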

You can find the full code for the service worker in the Chronic Reader repository.