Posts on Programming

Asynchronous programming with promises in JavaScript


Posted by Diego Assencio on 2017.10.22 under Programming (JavaScript)

Asynchronous programming in JavaScript is unfortunately not trivial. One way of dealing with asynchronous work is by simply using plain callbacks when certain events are triggered (e.g. a file is loaded from a server), but this is error-prone since it is difficult to make sure that all errors and exceptions are properly handled, especially through multiple callback functions.

A Promise in JavaScript is a great tool for writing asynchronous code, but understanding how to use it requires some effort. Sadly, a great deal of the literature written on the topic makes it seem much more complicated than it actually is, which is why I decided to write this post.

Let's start with a statement of the problem we are trying to solve: we wish to do some work, but this work can take a long time (e.g. fetching a file from a server) and we cannot simply wait until it is done before continuing the execution of the program, otherwise the user experience may suffer considerably (e.g. the webpage the user is visiting would stop responding to events such as key presses until the file is completely loaded). On top of that, when this work is done, we may wish to do additional work (e.g. fetch another file) based on the results obtained.

So how can promises help us solve this problem? The high-level answer is simple: a promise is an object which wraps a unit of work to be executed and acts as a handle to the outcome of this work. As soon as the outcome becomes available (e.g. our file has been loaded), it is handled by an appropriate user-defined callback function which is attached to the promise. We say that a promise is "resolved" (or "fulfilled") when the work it wraps finishes successfully and we say it is "rejected" when something goes wrong; to handle these situations separately, we can attach two callback functions to a promise: one to handle success and one to handle failure.
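As a minimal sketch of this idea (not yet related to file requests), the snippet below wraps a delayed computation in a promise and attaches one handler for each outcome. The names delayedDivision, onSuccess and onFailure, as well as the 10 ms delay, are assumptions made purely for illustration.

```javascript
/* hypothetical task: divide a by b after a short delay */
function delayedDivision(a, b)
{
    return new Promise(function(succeed, fail) {
        setTimeout(function() {
            if (b === 0)
            {
                /* failure: reject the promise with an error */
                fail(new Error("division by zero"));
            }
            else
            {
                /* success: resolve the promise with the result */
                succeed(a / b);
            }
        }, 10);
    });
}

function onSuccess(result)
{
    console.log("result: " + result);
}

function onFailure(error)
{
    console.log("failed: " + error.message);
}

/* one callback handles success, the other handles failure */
delayedDivision(6, 3).then(onSuccess, onFailure);
delayedDivision(1, 0).then(onSuccess, onFailure);
```

Only one of the two handlers is ever invoked for a given promise: the first call above ends up in onSuccess, the second in onFailure.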

Before jumping into a code example using promises, consider the following naive attempt to execute an asynchronous file request:

/* WARNING: this will NOT work as expected! */

function getFile(fileUrl)
{
    /* prepare an asynchronous request for fileUrl */
    var request = new XMLHttpRequest();
    request.open("GET", fileUrl, true);

    /* specify what to do when the loading operation finishes */
    request.addEventListener("load", function() {
        if (request.status < 400)
        {
            /* successful execution: return the file contents */
            return request.responseText;
        }
        else
        {
            /* failed execution: throw an exception */
            throw new Error(request.statusText);
        }
    });

    /* send the request */
    request.send();
}

/* request a file; handle success and failure separately */
try
{
    var contents = getFile("files/John.json");
    console.log("file contents:\n" + contents);
}
catch (error)
{
    console.log("could not get file: " + error);
}

Our intention in the program above is to asynchronously download a JSON file and handle success (file downloaded) by printing its contents and failure (file not downloaded) by throwing an exception. If you do not understand all technical aspects of the program, read it this way: XMLHttpRequest is an API which we use to prepare an asynchronous file request. The request variable acts as a handle to the result of this file request. By registering a callback function to handle the request's "load" event, we can define what needs to be done when the loading operation finishes: if the HTTP status code returned is smaller than 400, the file was successfully downloaded and its contents are then returned by the callback function, otherwise an exception is thrown to indicate that the file request failed. Unfortunately, even when the file request succeeds, the output of this program shows that we made a serious mistake in our thinking:

file contents:
undefined

What happened here? The answer is simple: getFile does not return anything. The return statement appearing in its body belongs to the callback function which handles the load event from request; the returned value will never be seen by getFile itself. As a matter of fact, from the way the code is written, any value returned by the callback function will actually be lost: no one will receive it. For the same reason, an exception thrown inside the callback would never reach the catch block, since the try block has long finished executing by the time the callback runs. In the program above, the try block invokes getFile, and since getFile does not return anything, its return value is undefined, and that is exactly what the program prints as the "file contents".
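The same effect can be reproduced without a server. In the sketch below, setTimeout stands in for the file request; the names getValueLater and callbackResult are hypothetical and exist only to make the timing visible.

```javascript
/* set by the callback once it eventually runs */
var callbackResult = null;

function getValueLater()
{
    /* setTimeout plays the role of the asynchronous file request */
    setTimeout(function() {
        callbackResult = "file contents";

        /* this return belongs to the callback, not to getValueLater */
        return callbackResult;
    }, 10);

    /* no return statement here: calling getValueLater yields undefined */
}

var contents = getValueLater();
console.log("returned value: " + contents);
console.log("callback already executed? " + (callbackResult !== null));
```

When getValueLater returns, the callback has not even started, so there is no value which getValueLater could possibly hand back to its caller.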

Asynchronous programming is not that easy in JavaScript, but making the program above work correctly requires only a few changes if we use a Promise object. Our first corrected version will be kept similar to the original (incorrect) program, but we will improve it afterwards:

var fileUrl = "files/John.json";

function getFile(succeed, fail)
{
    /* prepare an asynchronous request for fileUrl */
    var request = new XMLHttpRequest();
    request.open("GET", fileUrl, true);

    /* specify what to do when the loading operation finishes */
    request.addEventListener("load", function() {
        if (request.status < 400)
        {
            /* success: resolve promise with file contents */
            succeed(request.responseText);
        }
        else
        {
            /* failure: reject promise with an error */
            fail(new Error(request.statusText));
        }
    });

    /* send the request */
    request.send();
}

function processSuccess(contents)
{
    console.log("file contents:\n" + contents);
}

function processFailure(error)
{
    console.log("could not get file: " + error);
}

/* request file asynchronously */
var requestHandle = new Promise(getFile);

/* define how to handle success and failure */
requestHandle.then(processSuccess, processFailure);

The main difference now is in how getFile is called and how it signals success or failure: a promise (requestHandle) is created to wrap the file request; its constructor takes a function (getFile) which defines the work to be done (let's call it the "task function" from now on). The constructor of a promise immediately invokes its given task function with two arguments, both of which are functions that the task function must use to indicate success or failure respectively. These functions are not provided by the user, but by the promise constructor itself. In the definition of getFile, the parameters succeed and fail represent these functions: the former is used to return a value while the latter is used to return an error. If the file download succeeds, getFile calls succeed with the file contents as argument, otherwise fail is called with an error as argument indicating what went wrong. In both cases, the result is stored in requestHandle, i.e., requestHandle acts as a handle to the outcome of getFile.

Notice that the signaling of success or failure is asynchronous: it does not happen while getFile is executing, but only when the file request is finished — an event which can occur much later in time.
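The order of events can be made visible with a small sketch which records each step in an array. Here setTimeout again stands in for the file request, and the names events, task, handle and finished are assumptions introduced for this illustration.

```javascript
/* log of what happened, in the order in which it happened */
var events = [];

function task(succeed, fail)
{
    /* runs synchronously, while the Promise constructor executes */
    events.push("task started");

    /* signal success later, as a finished file request would */
    setTimeout(function() {
        events.push("task signals success");
        succeed("done");
    }, 10);
}

var handle = new Promise(task);
events.push("constructor returned");

/* the success handler only runs after the task has signaled success */
var finished = handle.then(function(result) {
    events.push("success handler got: " + result);
    console.log(events.join("\n"));
});
```

The task function has already started by the time the constructor returns, but the success signal and the handler invocation only happen afterwards.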

After getFile calls either succeed or fail, what happens to the result it sends? The answer is in the last line of the program: requestHandle calls its then method to register callback functions which handle success and failure respectively. As soon as requestHandle receives a result from getFile, it will call the appropriate callback function with that result as argument. In the program above, these callback functions are processSuccess (the "success handler") and processFailure (the "failure handler") respectively. If you attempt to run this program in your browser console, you should get the following output:

file contents:
{
    "name": "John",
    "age": 35,
    "mother": "Mary"
}

One unfortunate aspect of the way this program is written is the fact that we cannot invoke getFile with fileUrl as its single argument; instead, we converted fileUrl into a global variable so it can be used by getFile. This is ugly but can be easily fixed by changing getFile to make it create and return the promise which wraps the file request. Additionally, let's use the opportunity to improve the names of the success and failure handlers:

function getFile(fileUrl)
{
    /* auxiliary function to be invoked by the promise constructor */
    function getFileTask(succeed, fail)
    {
        /* prepare an asynchronous request for fileUrl */
        var request = new XMLHttpRequest();
        request.open("GET", fileUrl, true);

        /* specify what to do when the loading operation finishes */
        request.addEventListener("load", function() {
            if (request.status < 400)
            {
                /* success: resolve promise with file contents */
                succeed(request.responseText);
            }
            else
            {
                /* failure: reject promise with an error */
                fail(new Error(request.statusText));
            }
        });

        /* send the request */
        request.send();
    }

    return new Promise(getFileTask);
}

function displayFile(contents)
{
    console.log("file contents:\n" + contents);
}

function printError(error)
{
    console.log("could not get file: " + error);
}

/* request a file; handle success and failure separately */
getFile("files/John.json").then(displayFile, printError);

The last line of this program deserves appreciation: it clearly communicates that we wish to fetch the file files/John.json and then display its contents if the file request succeeds or print an error message if it fails.

Despite the fact that all this sounds very interesting, the programs above can be written without using promises in ways which some will find easier to understand. However, if you need to chain actions together, using promises will make your life a lot easier. Indeed, when the then method is called for a promise (let's call it the "original promise" from now on), it produces a new Promise object which can register its own success and failure handlers (let's call it the "then-promise" from now on). The then-promise will succeed or fail depending on the outcome of the result-handler callback which is invoked by the original promise, regardless of whether this is a success or failure handler. Specifically, the following will happen:

1. If the result-handler callback invoked by the original promise returns a non-promise value, the then-promise will be resolved with that value, i.e., its success handler will be called with that value as argument.
2. If the result-handler callback invoked by the original promise throws an exception, the then-promise will be rejected with that exception, i.e., its failure handler will be invoked with that exception as argument.
3. If the result-handler callback invoked by the original promise returns a promise, the success and failure handlers of the then-promise will be used as success and failure handlers for the returned promise respectively.
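The three rules above can be observed in a short sketch which does not involve any file request. It uses Promise.resolve, a standard function which creates an already resolved promise; the variable names (original, fromValue, fromThrow, fromPromise) and the concrete values are assumptions chosen for illustration.

```javascript
/* an original promise which is already resolved with the value 1 */
var original = Promise.resolve(1);

/* rule 1: the handler returns a non-promise value */
var fromValue = original.then(function(n) {
    return n + 1;
});

/* rule 2: the handler throws an exception */
var fromThrow = original.then(function(n) {
    throw new Error("handler failed");
});

/* rule 3: the handler returns another promise */
var fromPromise = original.then(function(n) {
    return new Promise(function(succeed) {
        setTimeout(function() { succeed(n + 100); }, 10);
    });
});

/* each then-promise registers its own handlers */
fromValue.then(function(value) {
    console.log("rule 1: resolved with " + value);
});

fromThrow.then(null, function(error) {
    console.log("rule 2: rejected with " + error.message);
});

fromPromise.then(function(value) {
    console.log("rule 3: resolved with " + value);
});
```

Note that then can be called more than once on the same promise; each call produces its own then-promise.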

All of this may sound complicated, but I hope that seeing each of these cases in action will make things clearer for the reader.

Example #1: original promise handler returns a non-promise value

function getFile(fileUrl)
{
    /* same as before... */
}

function getAge(jsonData)
{
    return JSON.parse(jsonData).age;
}

function addTen(number)
{
    return number + 10;
}

function printResult(result)
{
    console.log("John's age plus 10 is: " + result);
}

function printError(error)
{
    console.log("something went wrong: " + error);
}

getFile("files/John.json").then(getAge)
                          .then(addTen)
                          .then(printResult, printError);

In this example, we omitted the callbacks for handling failures on the first two promises (notice the absence of a second parameter on the first two calls to then). This is allowed and equivalent to setting these failure handlers to null, in which case a rejection of the promise simply passes on the error to the then-promise to be handled by its own failure handler. In this program, any error will be passed on until it is received by printError, which will then print it. From the structure of the program, we can see that provided the file request succeeds, the result of the success handler of each promise is passed on to the success handler of its associated then-promise, forming a chain which results in the following output:

John's age plus 10 is: 45

Example #2: original promise handler throws an exception

The example below is the same as example #1, with a single difference: we modify getAge to have it simply throw an exception instead of returning an age value:

function getFile(fileUrl)
{
    /* same as before... */
}

function getAge(jsonData)
{
    throw new Error("getAge does not want to share this data!");
}

function addTen(number)
{
    return number + 10;
}

function printResult(result)
{
    console.log("John's age plus 10 is: " + result);
}

function printError(error)
{
    console.log("something went wrong: " + error);
}

getFile("files/John.json").then(getAge)
                          .then(addTen)
                          .then(printResult, printError);

When getAge is called, it throws an exception, causing the first then-promise in the chain to be rejected with that exception as result. Since it only registered a success handler (addTen), it simply passes on the exception to the second then-promise, which then invokes its own failure handler (printError) with the exception as argument. The output of this program is then:

something went wrong: Error: getAge does not want to share this data!

Example #3: original promise handler returns a promise

The third and final example will illustrate the case in which a success handler returns a promise. If that happens, the then-promise will become a proxy to it, meaning the then-promise will be resolved or rejected with the same value or error as the promise returned by the original success handler.

In this example, we will attempt to fetch a file, and if that operation succeeds, we will request another file and return a promise which is a handle to the outcome of this second file request:

function getFile(fileUrl)
{
    /* same as before... */
}

function getMotherFile(jsonData)
{
    var motherName = JSON.parse(jsonData).mother;

    /* return a promise */
    return getFile("files/" + motherName + ".json");
}

function printAge(jsonData)
{
    console.log("age of John's mother: " + JSON.parse(jsonData).age);
}

function printError(error)
{
    console.log("something went wrong: " + error);
}

getFile("files/John.json").then(getMotherFile)
                          .then(printAge, printError);

The output of this program shows that the success handler for the first then-promise (printAge) is used as the success handler for the promise returned by getMotherFile itself:

age of John's mother: 62
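The examples above need a server which provides files/John.json and the file of John's mother. To experiment without one, getFile can be replaced by an in-memory stand-in, as in the sketch below. The fakeFiles table, the contents of files/Mary.json (only the age of 62 is taken from the output above), the "Not Found" error text and the helper extractAge are assumptions made for this illustration.

```javascript
/* hypothetical in-memory "server": file URL -> file contents */
var fakeFiles = {
    "files/John.json": JSON.stringify({ name: "John", age: 35, mother: "Mary" }),
    "files/Mary.json": JSON.stringify({ name: "Mary", age: 62 })
};

/* stand-in for getFile: same interface, but no network is involved */
function getFile(fileUrl)
{
    return new Promise(function(succeed, fail) {
        setTimeout(function() {
            if (Object.prototype.hasOwnProperty.call(fakeFiles, fileUrl))
            {
                succeed(fakeFiles[fileUrl]);
            }
            else
            {
                fail(new Error("Not Found"));
            }
        }, 10);
    });
}

function getMotherFile(jsonData)
{
    /* return a promise for the second (fake) file request */
    return getFile("files/" + JSON.parse(jsonData).mother + ".json");
}

function extractAge(jsonData)
{
    return JSON.parse(jsonData).age;
}

var motherAge = getFile("files/John.json").then(getMotherFile)
                                          .then(extractAge);

motherAge.then(function(age) {
    console.log("age of John's mother: " + age);
}, function(error) {
    console.log("something went wrong: " + error);
});
```

Since the stand-in has the same interface as the real getFile, the handlers from the examples above can be used with it unchanged.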

Summary

This post explained how a Promise object wraps a unit of (possibly asynchronous) work and acts as a handle to its outcome, processing success or failure using dedicated callback functions which are registered through the promise's then method. The real advantage of using promises instead of manually handling asynchronous tasks comes from the fact that the then method also returns a promise, making work chains easier to create and understand.


Determining types deduced by the compiler in C++


Posted by Diego Assencio on 2017.08.23 under Programming (C/C++)

Suppose you are debugging a C++ program and need to know which type the compiler deduces for a certain expression. There are many complicated ways of doing this, but in this post, I will show you a very simple trick which allows you to determine types directly at compile time.

The trick is this: declare a class template but do not define it, then attempt to instantiate this class template with the expression whose type you are trying to determine. Here is an example:

/* class template declaration (no definition available) */
template<typename T>
class ShowType;

int main()
{
    signed int x = 1;
    unsigned int y = 2;

    /* (signed int) + (unsigned int): what is the resulting type? */
    ShowType<decltype(x + y)> dummy;

    return 0;
}

In the code above, we are trying to determine the type deduced by the compiler when we add a signed int (x) and an unsigned int (y). This type is decltype(x + y). When the compiler attempts to create an instance of ShowType<decltype(x + y)>, it realizes this is not possible and indicates the problem with a very helpful error message:

error: aggregate ‘ShowType<unsigned int> dummy’ has incomplete type and cannot
be defined

In this message, the compiler (in my case, gcc) is telling us that it tried to create an instance of ShowType<unsigned int> but failed at it. Therefore, the type of the expression x + y is decltype(x + y) = unsigned int. This resulting type comes directly from the usual arithmetic conversions which the C++ language applies when a signed int and an unsigned int are added.

Let's try a more interesting example. In C++, the type deduction rules for template parameters are complex. When in doubt, you can use the trick above to determine which type is deduced by the compiler for a certain template parameter:

/* class template declaration (no definition available) */
template<typename T>
class ShowType;

template<typename T>
void my_function(T x)
{
    ShowType<T> dummy;

    /* do something with x */
}

int main()
{
    const int x = 3;
    my_function(x);

    return 0;
}

One common doubt which developers have regarding the type T in my_function is: will it be deduced to be int or const int? As it turns out, since we are passing x by value, the top-level const is dropped and the compiler will deduce T to be int:

error: ‘ShowType<int> dummy’ has incomplete type

As a final example, let's consider auto. The rules for auto type deduction are usually the same as the ones for template types, but auto type deduction assumes that initializing expressions such as {1,2,3} represent initializer lists. Let's show that in practice:

#include <initializer_list>

template<typename T>
class ShowType;

int main()
{
    auto x = {1,2,3};

    ShowType<decltype(x)> dummy;

    return 0;
}

The error message tells us what we expect:

error: aggregate ‘ShowType<std::initializer_list<int>> dummy’ has incomplete
type and cannot be defined

Notice that we need to pass an actual type (not an expression) to ShowType<T>, so in the examples above in which we wanted to determine the type of a certain expression (e.g. x + y), we needed to enclose the expression with decltype. In the second example, we already had the desired type T available in the definition of my_function, so we could use it directly.


How std::move breaks return value optimization


Posted by Diego Assencio on 2017.08.07 under Programming (C/C++)

Consider the following program:

std::vector<int> random_numbers(const size_t n)
{
    std::vector<int> numbers;

    /* generate n random numbers, store them on numbers */

    return numbers;
}

int main()
{
    /* create an array with 100 random numbers */
    std::vector<int> my_numbers = random_numbers(100);

    ...
}

Returning an std::vector<int> by value can make a developer nervous: will this cause the vector to be copied? That would be a costly performance penalty and therefore something to be avoided, if possible.

In this type of situation, a solid understanding of how return value optimization (or simply RVO, for short) works leads to inner peace. Any decent compiler will understand that numbers is a local variable whose value will be returned at the end of the random_numbers function, and since this returned value is an rvalue (a temporary object, see the call to random_numbers in main), it can be used to initialize my_numbers directly, i.e., without going through the copy or move constructor of std::vector<int>. In other words, no vectors should be copied or moved in the code above. How wonderful!

Let's take a closer look at how RVO removes the need for a copy operation. When random_numbers is called, memory on the call stack will be reserved for its return value (an std::vector<int>). Inside of random_numbers, numbers is clearly a local variable whose value will be returned at the end, so instead of copying this value to the memory location reserved for the function's return value at the very end, numbers is stored directly at that memory location. The elimination of such a copy operation when a function returns is what "return value optimization" stands for. In general, any type of optimization which causes copy operations to be eliminated is referred to as a copy elision optimization.

Despite all these facts, it is not uncommon for developers to be less confident than they should in this type of situation and prefer "being on the safe side" by writing this type of code:

std::vector<int> random_numbers(const size_t n)
{
    std::vector<int> numbers;

    /* generate n random numbers, store them on numbers */

    return std::move(numbers);   /* don't do this! */
}

As innocent as this decision may be, it is severely flawed because of the way RVO works: only a local variable or a temporary object can be stored directly at the memory location for a function's return value (function parameters are not eligible for that), and this is only allowed if such an object is directly returned by the function and has the same type as the function's return type. By adding std::move on the return statement above, we converted the type of the returned object to std::vector<int>&& (an rvalue reference to an std::vector<int>), but random_numbers returns an std::vector<int>. This violates the conditions required for RVO, and the compiler will have no choice but to make numbers be constructed outside the memory area reserved for random_numbers's return value and then moved to that location when the function returns (a move operation is still possible here since std::move(numbers) is an rvalue and will therefore trigger the move constructor for the std::vector<int> which is constructed as the return value).

For many user-defined types, such an additionally incurred move operation will be cheap, but it will definitely not be cheaper than what RVO offers and therefore, by "playing safe", we ended up inevitably pessimizing our program. Also, notice that we were lucky to have a return type (std::vector<int>) with a move constructor; had this not been the case, the added std::move would have caused the return value to be initialized through a copy constructor, which is what we wanted to avoid in the first place. Ouch!

To conclude, here is an example which involves all concepts discussed so far:

#include <iostream>
#include <utility>   /* std::move */

class X
{
public:
    /* default constructor */
    X() { std::cout << "X::X()\n"; }

    /* copy constructor */
    X(const X&) { std::cout << "X::X(const X&)\n"; }

    /* move constructor */
    X(X&&) { std::cout << "X::X(X&&)\n"; }
};

X good()
{
    std::cout << "good()\n";

    X x;
    return x;
}

X bad()
{
    std::cout << "bad()\n";

    X x;
    return std::move(x);
}

int main()
{
    X x1 = good();
    X x2 = bad();

    return 0;
}

The program's output illustrates how the std::move on the bad function disables RVO and forces an unnecessary move operation (this will be the case even if you compile with lots of optimizations enabled, e.g. by using -O3 on gcc):

good()
X::X()
bad()
X::X()
X::X(X&&)

Try compiling and running this program, then remove the move constructor from X. The resulting output shows that std::move now causes X to be copied:

good()
X::X()
bad()
X::X()
X::X(const X&)