Monday, January 4, 2021

Backpropagation through graphical models for new year's resolutions and planning out priorities for the next year

Around New Year's every year, I take some time to reflect on the successes and failures of the past year and decide what I should prioritize for the next year. Usually I do this using a combination of paragraphs and bullet point lists.

I recently took a deep learning class where I learned about a technique called backpropagation, which is used to find optimal parameters for a predictive model. It occurred to me that an analogous thought process could be used to help guide New Year's planning.

This is the first time I have gone through this exercise, so I'm making it up as I go along. If you decide to try this (or if you have already tried something similar), I would be really delighted to hear any feedback (good or bad) or suggestions you might have for improving the method. 

In a backpropagation process, a model is first used to make a prediction, for example predicting what digit is in a scanned picture. Then the difference between the prediction and the actual answer is calculated. When the model makes an incorrect prediction, backpropagation can be used to calculate how much blame each model parameter deserves for the wrong prediction. Model parameters that deserve a lot of blame can be given large adjustments, and model parameters that don't deserve any blame can be left unchanged. After many cycles of predictions, backpropagation, and parameter adjustments (together called gradient descent), a model slowly gets better at making predictions.
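As a toy illustration of the blame idea above, here is a minimal sketch of gradient descent on a two-parameter linear model. The function name, learning rate, and data are all made up for this example; it is not taken from any particular library or from the class.

```python
# Toy gradient descent on y = w[0]*x[0] + w[1]*x[1] with a squared-error loss.
# All names and numbers here are illustrative assumptions.
def train(x, y_true, w, learning_rate=0.1, steps=100):
    w = list(w)
    for _ in range(steps):
        # Forward pass: make a prediction with the current parameters.
        y_pred = sum(wi * xi for wi, xi in zip(w, x))
        # Difference between the prediction and the actual answer.
        error = y_pred - y_true
        # Backpropagation: how much blame each parameter gets for the error,
        # i.e. the derivative of error**2 with respect to each parameter.
        grads = [2 * error * xi for xi in x]
        # Parameters with more blame get larger adjustments.
        w = [wi - learning_rate * g for wi, g in zip(w, grads)]
    return w


# The second input is 0, so the second parameter never affects the
# prediction, gets no blame, and is left unchanged.
weights = train(x=[1.0, 0.0], y_true=3.0, w=[0.0, 0.5])
print(weights)
```

The first parameter converges toward 3, while the second stays at its starting value because it never contributed to the error.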



To use backpropagation for life-planning, I need to first define some life-planning primitives and map them to machine learning model primitives.

The model will consist of a directed graph (preferably acyclic) of three different kinds of nodes and one kind of directed edge. Every edge and node will have a footnote. After constructing the topology of the graph, I will go through and assign a value between -1 and 1 to every edge, and use that to calculate what I should put the most effort into in the coming year, and why.

Due to the difficulty of predicting/quantifying the effects of actions, this model should not be thought of as strictly quantitative, but more as quantitative-ish. The numbers are just to help guide my subjective reasoning. The hypothesis driving this exercise is that structured reasoning about priorities is better than haphazard decision making. Making a comprehensive or completely mathematically coherent model is not my intention (and I don't think it would be possible even if I wanted to).

In addition to the graph and footnotes, there is also a section for an overall conclusion.

Actual picture of me after finishing this exercise

A graph of possible things I could do in 2021, and how they might make me more or less useful.
See discussion below for details on symbols. (A higher-resolution version is available in the spreadsheet linked at the very bottom of this post.)

Objectives (triangle nodes)

First, I need to define what I am trying to accomplish. I'm a consequentialist utilitarian, so my objective this year, like every other year, is to reduce human and animal suffering over the short term and long term, and to avoid disappointing or alienating my friends and family too much in the process (not because I think that my friends/family are more important than anyone else, but because they are acutely sensitive to my failings, so I need to be particularly aware of their interests).

It's likely that you are not a consequentialist utilitarian, so your objectives will likely be something else, maybe "save money for a house down-payment", or "go golfing as much as possible", or "gain new followers for xyz church or political party", or "remain on good terms with my partner".

If there are multiple objectives, then we need to figure out how much to weight each one. I represent this prioritization by drawing a single combined sink node "total utility", then drawing edges from all of the other objectives into the sink objective and assigning different weights to the different edges (see Edges section below).

In my case, I rank my priorities approximately like this:

reducing human suffering >> not disappointing my family > reducing animal suffering > maintaining my personal health. 

(Whether this is a justifiable ranking of priorities is beyond the scope of this post)


Decision variables (rectangular nodes)

I also need to define my decision variables (or model parameters). These are direct actions I could take, where I am still trying to decide whether to take them and, if so, how much effort to put in. They are things I have high confidence in my ability to directly influence, and for which I have not yet decided how much effort (if any) to commit over the coming year.

Enumerating possible actions can quickly become frustrating and overwhelming, so it's important to focus only on things which could potentially require a lot of time or resource commitment or have a large impact on my effectiveness, and which I am unsure about how much to commit to.

For example, things like "don't eat any meat" or "bike to work every day" are not decision variables for me, even though they may contribute a lot to my objectives and require substantial time and effort. I have zero intention of eating meat this year and every intention of biking to work every day, so these aren't questions I'm still trying to make up my mind about.

Try to keep this as a relatively small list, between 5 and 20 possible actions. I ended up with 16 decision variables.

While trying to decide on your decision variables, you might think of lots of little things that you should do (or stop doing) but which won't require lots of resources or have a large influence on your objectives. You should write these down somewhere so you remember them, but it might not be worth the effort to describe them in detail and go through the whole backpropagation exercise with them.

In my case, I thought of scheduling a dentist appointment and starting to take a multivitamin.

Each decision variable should have a footnote. The footnote should define in detail the range of possible commitment.

After constructing the graph, I will go back and calculate the possible contribution of each decision variable to my overall utility. I will also write out a conclusion for what I am going to try to do (or not) for this decision variable in the upcoming year.


Hidden variables (oval nodes)

These are factors I can't directly control, but which are influenced by things I can control (the "decision variables"), and which may have an influence on the objectives or on other hidden variables.

The purpose of enumerating the hidden variables is to create a conceptual bridge between decision variables and the objectives to help with reasoning about how much the decision variables will influence the objectives and interact with other decision variables.

Each hidden variable should have a footnote of about 1-5 sentences in length.

Try to keep the number of hidden variables approximately the same (could be a little higher or lower) as the number of decision variables. I ended up with 18 hidden variables.


Lines of influence (directed edges)

Draw a line from a node that causes a change to a node where the change is caused. For example "strength training" can improve health, so I draw an edge from "strength training" to "physical health".

The edges represent an opportunity to explicitly lay out and explain causal connections.

It might be tempting to draw lots of complicated connections (reality is complicated, and everything we do is unavoidably connected to everything else we do), but try to keep the graph relatively simple.  In particular try to avoid loops. If you must have a loop, it's not the end of the world, but they might make it difficult to attribute influence and run the backpropagation algorithm.
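One way to check a graph for loops before trying to attribute influence is a topological sort. The following is a minimal sketch using the graphlib module from the Python standard library (Python 3.9 or newer). The function name find_cycle and the looping edge are hypothetical, invented for this example.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(edges):
    """Return a list of nodes forming a loop, or None if the graph is acyclic.

    `edges` is an iterable of (source, destination) pairs.
    """
    # TopologicalSorter expects a mapping from each node to its predecessors.
    predecessors = {}
    for source, dest in edges:
        predecessors.setdefault(source, set())
        predecessors.setdefault(dest, set()).add(source)
    try:
        TopologicalSorter(predecessors).prepare()
    except CycleError as err:
        # The second element of args is the list of nodes in the detected cycle.
        return err.args[1]
    return None


acyclic = [("strength training", "physical health"),
           ("physical health", "total utility")]
# Hypothetical extra edge that closes a loop, for illustration only.
looped = acyclic + [("physical health", "strength training")]

print(find_cycle(acyclic))
print(find_cycle(looped))
```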

You will have to write a footnote for every edge, so if you can't think of anything insightful to write about an edge, then it's probably best to just leave it off.

I assign each edge a number between -1 and +1 indicating how strong of an influence the source node has over the destination node. -1 means that more time/effort spent on the source node will decrease the strength of the destination node by a lot. +1 means that more time/effort spent on the source node will increase the strength of the destination node by a lot. The footnote should be used to justify the value assignment.

Oftentimes, if you find yourself assigning a weight close to 0 to an edge, it might be better to just leave that edge off (because the 0 means you don't think that the source node has much influence on the destination node). Exceptions would be if the assignment of 0 is somehow surprising, controversial, or counter-intuitive. For example, in my graph I assign a weight of -0.01 to the edge between "stop eating dairy products" and "animal agriculture" because I don't think my maximal dairy consumption is at a level that causes much increased demand for animal agriculture.
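The pieces described so far (three kinds of nodes, weighted edges, and a footnote on everything) could be held in a small data structure like the one below. This is only a sketch: the class name, the kind labels, the footnote text, and the 0.5 weight are all assumptions made up for illustration, not values from the actual graph.

```python
from dataclasses import dataclass, field

# Labels for the three node shapes: triangle, rectangle, oval.
NODE_KINDS = {"objective", "decision", "hidden"}


@dataclass
class PlanningGraph:
    nodes: dict = field(default_factory=dict)  # name -> (kind, footnote)
    edges: dict = field(default_factory=dict)  # (source, dest) -> (weight, footnote)

    def add_node(self, name, kind, footnote):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind!r}")
        self.nodes[name] = (kind, footnote)

    def add_edge(self, source, dest, weight, footnote):
        if source not in self.nodes or dest not in self.nodes:
            raise KeyError("both ends of an edge must already be nodes")
        if not -1.0 <= weight <= 1.0:
            raise ValueError("edge weights must be between -1 and +1")
        if not footnote:
            raise ValueError("every edge needs a footnote justifying its weight")
        self.edges[(source, dest)] = (weight, footnote)


graph = PlanningGraph()
graph.add_node("strength training", "decision", "Range: none up to several sessions a week.")
graph.add_node("physical health", "hidden", "General fitness and resistance to injury.")
# The 0.5 weight is a placeholder, not a value from the actual graph.
graph.add_edge("strength training", "physical health", 0.5,
               "Regular training should noticeably improve health.")
print(graph.edges)
```

Requiring a footnote in add_edge mirrors the rule that an edge with nothing insightful to say about it is probably best left off.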


Footnotes

I list the footnotes in alphabetical order, with edges named by combining the name of the source and destination node.

Every node and edge should have a footnote. The format and content of the footnote will depend on the type of node/edge (see above for specifics).

I made my diagram with software from https://www.diagrams.net/, but I think something like graphviz or one of the many web-based graph drawing libraries, like d3 or cytoscape.js (something that has a force-directed node layout) would probably be easier and make nicer diagrams.


Backpropagation

I calculate the contribution of each decision variable to the value of the sink objective node. I think of each edge as a linear transformation, so the entire graph is just a simple linear network model, which makes the backpropagation math very simple.

Assign 1 to the sink objective (the derivative of the sink objective with respect to itself is 1). The value of every other node is the sum, over its outgoing edges, of the edge weight times the value of the node that edge points to. This means that you can't calculate the value of a node until after calculating the values for all nodes that it points to. I wrote some Python code that will do the calculation, so you don't have to worry about it!

https://gist.github.com/seanrjohnson/9244b7ed086540d3a4b455dc17c88eea
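For readers who want to see the idea without opening the link, here is a minimal independent sketch of the calculation described above. It is not the code from the gist, and the node "free time" and all of the weights are hypothetical values chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def backpropagate(edges, sink):
    """Return {node: derivative of the sink with respect to that node}.

    `edges` maps (source, dest) pairs to weights; the graph must be acyclic.
    """
    outgoing = defaultdict(list)
    for (source, dest), weight in edges.items():
        outgoing[source].append((dest, weight))

    values = {sink: 1.0}  # derivative of the sink with respect to itself
    in_progress = set()

    def value(node):
        if node in values:
            return values[node]
        if node in in_progress:
            raise ValueError(f"loop detected at {node!r}")
        in_progress.add(node)
        # Sum of (edge weight * value of the node immediately downstream).
        values[node] = sum(w * value(dest) for dest, w in outgoing[node])
        in_progress.discard(node)
        return values[node]

    for source, dest in list(edges):
        value(source)
        value(dest)
    return values


# Hypothetical weights, for illustration only.
edges = {
    ("strength training", "physical health"): 0.5,
    ("strength training", "free time"): -0.2,
    ("physical health", "total utility"): 0.4,
    ("free time", "total utility"): 0.5,
}
values = backpropagate(edges, sink="total utility")
# strength training: 0.5 * 0.4 + (-0.2) * 0.5, which is roughly 0.1
print(round(values["strength training"], 3))
```

The recursion handles the ordering constraint automatically: a node's value is only computed once everything it points to has been computed.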

I note the values calculated from backpropagation below each node (blue for positive values, red for negative values). These values are technically the derivatives of the sink objective (total utility) node with respect to each of the nodes. This means that if you increased the value of a node by 1, the value of the sink objective would increase (or decrease, if the value is negative) by the amount shown below that node.
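The derivative interpretation can be checked by running the linear network forwards: bump one node by 1 and see how much the sink moves. The sketch below does this for a tiny graph. The node "free time" and all weights are hypothetical, and the forward function is an assumption about how such a linear network would be evaluated, not part of the original method.

```python
from graphlib import TopologicalSorter


def forward(edges, inputs):
    """Push values through a linear acyclic graph, from sources to sink.

    Each node's value is its own input (default 0) plus the weighted sum
    of the values of the nodes pointing into it.
    """
    predecessors = {}
    for (source, dest), weight in edges.items():
        predecessors.setdefault(source, {})
        predecessors.setdefault(dest, {})[source] = weight
    values = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        values[node] = inputs.get(node, 0.0) + sum(
            weight * values[source]
            for source, weight in predecessors[node].items()
        )
    return values


# Hypothetical weights, for illustration only.
edges = {
    ("strength training", "physical health"): 0.5,
    ("strength training", "free time"): -0.2,
    ("physical health", "total utility"): 0.4,
    ("free time", "total utility"): 0.5,
}
before = forward(edges, {"strength training": 1.0})["total utility"]
after = forward(edges, {"strength training": 2.0})["total utility"]
# Summing over both paths gives 0.5 * 0.4 + (-0.2) * 0.5, roughly 0.1,
# so the change in total utility should be about 0.1.
change = after - before
print(round(change, 3))
```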

The fact that this is an acyclic linear model makes the backpropagation relatively simple and intuitive. If there were loops or non-linear transformations involved, it would quickly become a headache. 


Summary

In addition to the footnotes, I write a conclusion with particular plans and points of action for the coming year.


Method Conclusion and Future directions

Making this graph in the first place takes quite a lot of effort (I think this one took me at least 20 hours, but I didn't time it, and I was also writing this blog post at the same time, which took effort). But once it is made, it should be relatively easy to revisit every year or several times a year to compare results with predictions, or to add new decision or hidden variables and think about how they would interact with your existing network.

Despite the fact that the backpropagation method from machine learning was the inspiration for this exercise, it is probably the least important step. By the time I had drawn the graph and written out all of the node footnotes, I already had a pretty good idea of what my priorities should be. Writing edge footnotes and calculating the backpropagation had somewhat diminishing returns compared to just constructing the graph. I was somewhat surprised that the ranking of priorities from the backpropagation pretty closely matched my intuition and actual practice over the past few years, which suggests to me that this graph model with linear weights is actually a reasonably faithful way to mathematically represent my beliefs and priorities. Going into this exercise I thought there might be a real possibility that the math would give completely uninterpretable and useless results, but that doesn't seem to be the case.

There are a lot of extensions and improvements that could be tried. Here are a few I thought of:

I could try playing around with normalization of the edge weights, like dividing all the weights of edges coming into a node by the sum of the absolute values of the incoming edge weights.

It's kind of weird that the edge weights are fixed parameters rather than learned parameters.

If other people find this approach to be useful, I would consider writing software to make it easier.
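The normalization idea from the list above (dividing each incoming edge weight by the sum of the absolute values of all weights coming into the same node) could be sketched as follows. The function name, the node "free time", and the weights are hypothetical; this is one possible reading of the idea, not something that has been tried on the real graph.

```python
from collections import defaultdict


def normalize_incoming(edges):
    """Scale weights so the absolute incoming weights of each node sum to 1.

    `edges` maps (source, dest) pairs to weights.
    """
    totals = defaultdict(float)
    for (source, dest), weight in edges.items():
        totals[dest] += abs(weight)
    return {
        # Nodes whose incoming weights are all 0 are left at 0.
        (source, dest): (weight / totals[dest] if totals[dest] else 0.0)
        for (source, dest), weight in edges.items()
    }


# Hypothetical weights, for illustration only.
edges = {
    ("physical health", "total utility"): 0.6,
    ("free time", "total utility"): -0.2,
}
normalized = normalize_incoming(edges)
print(normalized)
```

Signs are preserved, so a negative influence stays negative; only the relative magnitudes within each destination node change.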


Worked out Example

See the figure near the top of the post for my graph for this year. Due to the high degree of uncertainty regarding the possibility of travel and socializing because of the ongoing pandemic, many of my conclusions are "reassess at the end of spring or summer", which is a little anticlimactic.

I couldn't get tables to render very well in blogger, so here's a link to a Google Sheet with all of the node and edge footnotes, as well as a conclusion.

https://docs.google.com/spreadsheets/d/1vmCISyB7Mc9tKoIoP1QoxPx0375ZrfREc8znUa6lejc/edit?usp=sharing

