Slow Matrices? Try Lists.

TL;DR

What do psychedelic triangles and chimeras have in common? Oddly the answer is “list data structures”. I discovered this while procrastinating from the side project that was my procrastination from my actual projects. Much to my surprise, shading triangles from vertex colors is faster with list data structures than with matrices and arrays. This despite the calculations involved being simple additions and multiplications on large atomic vectors.

As with everything in life there are caveats to this finding. It is primarily applicable to tall and skinny data sets, i.e. those with a very high ratio of observations to variables. It also requires some additional work over standard methods. Given that this type of data set is fairly common, it seems worth showing methods for working with list data structures. In my particular use case I saw speed improvements on the order of 2x to 4x.

If you are bored easily you might want to skip straight to the case study section of this post. If you like benchmarks and want an extensive review of list data structure manipulations, read on.

Matrices Are Slow?

Column Subsetting

Generally speaking matrices are amongst the R data structures that are fastest to operate on. After all, they are contiguous chunks of memory with almost no overhead. Unfortunately, they do come with a major limitation: any subset operation requires copying the data:

library(microbenchmark)  # used for all the timings in this post
set.seed(1023)
n <- 1e5

M <- cbind(runif(n), runif(n), runif(n), runif(n))
D <- as.data.frame(M)
microbenchmark(M[,1:2], D[,1:2], times=25)
Unit: microseconds
     expr min     lq   mean median     uq  max neval
 M[, 1:2] 565 1651.1 2087.9 1844.2 2213.0 9322    25
 D[, 1:2]  16   23.5   49.2   42.1   71.6  105    25

The data frame subset is much faster because R treats data frame columns as independent objects, so referencing an entire column does not require a copy. But data frames have big performance problems of their own. Let’s see what happens if we select columns and rows:

idx <- sample(seq_len(nrow(M)), n/2)
microbenchmark(M[idx,1:2], D[idx,1:2], times=25)
Unit: microseconds
        expr   min    lq  mean median    uq   max neval
 M[idx, 1:2]   619  1263  1666   1600  1764  3246    25
 D[idx, 1:2] 10064 10601 11429  11080 11927 14299    25

And it's even worse if we have duplicate row indices:

idx2 <- c(idx, idx[1])
microbenchmark(M[idx2,1:2], D[idx2,1:2], times=25)
Unit: microseconds
         expr   min    lq  mean median    uq   max neval
 M[idx2, 1:2]   615  1407  1645   1509  1922  2492    25
 D[idx2, 1:2] 31205 35578 43288  42699 47804 67112    25

Data frames have row names that are required to be unique, and most of the row subsetting overhead comes from managing them.
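We can see that row name management in action if we look at the row names of the duplicated subset. This is just an illustrative check with the objects we defined above; the exact labels depend on the sampled idx:

tail(rownames(D[idx2, 1:2]), 2)  # the repeated row gets a uniquifying ".1" suffix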

Lists to the Rescue

With a little extra work you can do most of what you would normally do with a matrix or data frame using a list instead. For example, to select “rows” in a list of vectors, we use lapply to apply [ to each component vector:

L <- as.list(D)
str(L)
List of 4
 $ V1: num [1:100000] 0.249 0.253 0.95 0.489 0.406 ...
 $ V2: num [1:100000] 0.7133 0.4531 0.6676 0.651 0.0663 ...
 $ V3: num [1:100000] 0.87 0.783 0.714 0.419 0.495 ...
 $ V4: num [1:100000] 0.44475 0.90667 0.12428 0.00587 0.77874 ...
sub.L <- lapply(L[1:2], '[', idx)
str(sub.L)
List of 2
 $ V1: num [1:50000] 0.997 0.43 0.945 0.798 0.252 ...
 $ V2: num [1:50000] 0.539 0.287 0.567 0.848 0.56 ...
col_eq(sub.L, D[idx,1:2])  # col_eq compares values column by column
[1] TRUE

This results in the same elements selected as with data frame or matrix subsetting, except the result remains a list. You can see how col_eq compares objects in the appendix.

Despite the explicit looping over columns, this list based subsetting is comparable in speed to matrix subsetting:

microbenchmark(lapply(L[1:2], '[', idx), M[idx,1:2], D[idx,1:2], times=25)
Unit: microseconds
                     expr  min    lq  mean median    uq   max neval
 lapply(L[1:2], "[", idx)  781  1338  1725   1686  2101  3006    25
              M[idx, 1:2] 1101  1230  1801   1450  2263  4607    25
              D[idx, 1:2] 8809 10720 12673  12826 13467 20697    25

And if we only index columns then the list is even faster than the data frame:

microbenchmark(L[1:2], M[,1:2], D[,1:2], times=25)
Unit: nanoseconds
     expr     min      lq    mean  median      uq     max neval
   L[1:2]     667     913    2538    1081    3738    8456    25
 M[, 1:2] 1558695 1742878 1975152 1868145 2075127 3037454    25
 D[, 1:2]   13966   15120   32273   18562   50256   76023    25

Patterns in List Based Analysis

Row Subsetting

As we saw previously the idiom for subsetting “rows” from a list of equal length vectors is:

lapply(L, '[', idx)

The lapply expression is equivalent to:

fun <- `[`
result <- vector("list", length(L))
for(i in seq_along(L)) result[[i]] <- fun(L[[i]], idx)

We use lapply to apply a function to each element of our list. In our case, each element is a vector, and the function is the subset operator [. In R, L[[i]][idx] and '['(L[[i]], idx) are equivalent because operators are really just functions that R recognizes at parse time.
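As a quick sanity check that the operator and the functional form really are the same thing (a minimal illustration with the objects from above):

identical(L[[1]][idx], '['(L[[1]], idx))
[1] TRUE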

Column Operations

We can extend the lapply principle to other operators and functions. To multiply every column by two we can use:

microbenchmark(times=25,
  Matrix=  M * 2,
  List=    lapply(L, '*', 2)
)
Unit: microseconds
   expr min   lq mean median   uq   max neval
 Matrix 968 2350 3380   2517 2832 17960    25
   List 459  924 2001   2366 2530  3622    25

This will work for any binary function. For unary functions we just omit the additional argument:

lapply(L, sqrt)

We can also extend this to column aggregating functions:

microbenchmark(times=25,
  Matrix_sum=  colSums(M),
  List_sum=    lapply(L, sum),
  Matrix_min=  apply(M, 2, min),
  List_min=    lapply(L, min)
)
Unit: microseconds
       expr  min   lq  mean median    uq   max neval
 Matrix_sum  544  551  1151    561   566 15209    25
   List_sum  527  536   595    563   612   848    25
 Matrix_min 6148 8016 10207  10166 10869 18238    25
   List_min  972  988  1100   1028  1170  1629    25

In the matrix case we have the special colSums function that does all the calculations in internal C code. Despite that, it is no faster than the list approach. And for operations without special column functions, such as finding the minimum by column, we need to use apply, at which point the list approach is much faster.
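One small difference is the shape of the result: the list versions return lists. If we want an atomic vector like the one colSums produces, we can unlist the result or use vapply; a quick sketch:

unlist(lapply(L, sum))      # atomic vector, like colSums(M)
vapply(L, min, numeric(1))  # atomic vector, like apply(M, 2, min)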

If we want to use a different scalar for each column, things get a bit more complicated, particularly for the matrix approach. The most natural way to do this with the matrix is to transpose and take advantage of column recycling. For lists we can use Map:

vec <- 2:5
microbenchmark(times=25,
  Matrix= t(t(M) * vec),
  List=   Map('*', L, vec)
)
Unit: microseconds
   expr  min   lq mean median   uq   max neval
 Matrix 3288 6794 7534   7178 8130 16447    25
   List  835 1128 2674   2912 3000 11654    25
col_eq(t(t(M) * vec), Map('*', L, vec))
[1] TRUE

Map is a close cousin of mapply. It calls its first argument, in this case the function *, with one element at a time from each of the subsequent arguments, looping through those elements. We use Map instead of mapply because Map is guaranteed to always return a list, whereas mapply will simplify the result, here to a matrix. The Map call above is equivalent to:

f <- `*`
result <- vector('list', length(vec))
for(i in seq_along(L)) result[[i]] <- f(L[[i]], vec[[i]])

There are some faster ways to do this with matrices, such as:

microbenchmark(times=25,
  Matrix=  M %*% diag(vec),
  List=    Map('*', L, vec)
)
Unit: microseconds
   expr  min   lq mean median   uq   max neval
 Matrix 2992 4329 5257   4378 4867 13458    25
   List  812 2211 3067   3011 3090 11760    25
col_eq(M %*% diag(vec), Map('*', L, vec))
[1] TRUE

But even that is slower than the list approach, with the additional infirmities that it cannot be extended to other operators and that it produces incorrect results if the data contains infinite values.
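To see the infinite value issue, here is a small standalone illustration: the zeros in diag(vec) meet the infinities during the matrix multiplication and produce NaN, whereas the column-by-column product never mixes columns:

m <- matrix(c(Inf, 1), nrow=1)
m %*% diag(2:3)              # second element is NaN instead of 3
Map('*', list(Inf, 1), 2:3)  # Inf and 3, as expected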

Map also allows us to operate pairwise on columns. Another way to compute the square of our data might be:

microbenchmark(times=25,
  Matrix= M * M,
  List=   Map('*', L, L)
)
Unit: microseconds
   expr min   lq mean median   uq   max neval
 Matrix 964 2311 3065   2449 2645 11396    25
   List 486 1058 2178   2502 2743  3477    25
col_eq(M * M, Map('*', L, L))
[1] TRUE

Row-wise Operations

A classic row-wise operation is to sum values by row. This is easily done with rowSums for matrices. For lists, we can use Reduce to collapse every column into one:

microbenchmark(times=25,
  Matrix= rowSums(M),
  List=   Reduce('+', L)
)
Unit: microseconds
   expr  min   lq mean median   uq   max neval
 Matrix 1701 2510 2902   2944 3161  4288    25
   List  713 1437 2169   1694 1986 14609    25
col_eq(rowSums(M), Reduce('+', L))
[1] TRUE

Much to my surprise, Reduce is substantially faster, despite explicitly looping through the columns. From informal testing, Reduce retains an edge up to ten columns or so. This may be because rowSums uses long double to store the interim row sums, whereas + adds double values directly, with some loss of precision as a result.

Reduce collapses the elements of the input list by repeatedly applying a binary function to turn two list elements into one. In this case the Reduce call is equivalent to:

result <- numeric(length(L[[1]]))
for(l in L) result <- result + l

An additional advantage of the Reduce approach is you can use any binary function for the row-wise aggregation.
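For example, a row-wise maximum is just a change of the function; a quick sketch using the element-wise pmax:

row.max <- Reduce(pmax, L)        # row-wise maximum across columns
col_eq(row.max, apply(M, 1, max))
[1] TRUE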

List to Matrix and Back

R is awesome because it adopts some of the better features of functional languages and list based languages. For example, the following two statements are equivalent:

ML1 <- cbind(x1=L[[1]], y1=L[[2]], x2=L[[3]], y2=L[[4]])
ML2 <- do.call(cbind, L)

Proof:

col_eq(ML1, ML2)
[1] TRUE
col_eq(ML1, M)
[1] TRUE

The blurring of calls, lists, data, and functions allows for very powerful manipulations of list based data. In this case, do.call(cbind, LIST) will transform any list containing equal length vectors into a matrix, irrespective of how many columns it has.

To go in the other direction:

LM <- lapply(seq_len(ncol(M)), function(x) M[,x])
col_eq(L, LM)
[1] TRUE

You could use split but it is much slower.
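For reference, the split version would look something like this sketch; note the components end up named by column index rather than by the original column names:

LM.split <- split(M, col(M))
col_eq(L, LM.split)
[1] TRUE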

More on do.call

List based analysis works particularly well with do.call and functions that use ... parameters:

microbenchmark(times=25,
  List=   do.call(pmin, L),
  Matrix= pmin(M[,1],M[,2],M[,3],M[,4])
)
Unit: milliseconds
   expr  min   lq mean median   uq  max neval
   List 2.90 3.23 3.80   3.30 3.36 13.5    25
 Matrix 5.52 7.32 8.05   7.51 7.88 19.5    25

The expression is both easier to type, and faster!

There are a couple of “gotchas” with do.call, particularly if you try to pass quoted language as part of the argument list (e.g. quote(a + b)). If you are just using normal R data objects, then the one drawback I am aware of is that the call stack can get messy if you ever invoke traceback() or sys.calls() from a function called by do.call. A related issue is that do.call calls that error can become very slow. So long as you are sure the do.call call is valid there should be no problem.
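To make the quoted language gotcha concrete, here is a small sketch with a hypothetical function f and an undefined a + b expression. The language object is spliced into the call that do.call constructs, so it gets evaluated when the argument is forced rather than being passed along as language:

f <- function(x) class(x)
f(quote(a + b))                             # "call": the language object itself
do.call(f, list(quote(a + b)))              # Error: object 'a' not found
do.call(f, list(quote(a + b)), quote=TRUE)  # "call" again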

Remember that data frames are lists, so you can do things like:

do.call(pmin, iris[-5])

High Dimensional Data

Matrices naturally extend to arrays for higher dimensional data. Imagine we wish to track triangle vertex coordinates in 3D space. We can easily set up an array structure to do this:

verts <- 3
dims <- 3
V <- runif(n * verts * dims)
tri.A <- array(
  V, c(n, verts, dims),
  dimnames=list(NULL, vertex=paste0('v', seq_len(verts)), dim=c('x','y','z'))
)
tri.A[1:2,,]
, , dim = x

      vertex
          v1    v2    v3
  [1,] 0.982 0.411 0.936
  [2,] 0.798 0.675 0.345

, , dim = y

      vertex
          v1    v2    v3
  [1,] 0.561 0.156 0.782
  [2,] 0.188 0.572 0.854

, , dim = z

      vertex
          v1    v2    v3
  [1,] 0.435 0.361 0.892
  [2,] 0.146 0.109 0.922

How do we do this with lists? Well, it turns out that you can make list-matrices and list-arrays. First we need to manipulate our data a little, as we created it originally for use with an array:

MV <- matrix(V, nrow=n)
tri.L <- lapply(seq_len(verts * dims), function(x) MV[,x])

Then, we just add dimensions:

dim(tri.L) <- tail(dim(tri.A), -1)
dimnames(tri.L) <- tail(dimnames(tri.A), -1)
tri.L
      dim
vertex x              y              z             
    v1 Numeric,100000 Numeric,100000 Numeric,100000
    v2 Numeric,100000 Numeric,100000 Numeric,100000
    v3 Numeric,100000 Numeric,100000 Numeric,100000
class(tri.L)
[1] "matrix"
typeof(tri.L)
[1] "list"

A chimera! tri.L is a list-matrix. While this may seem strange if you haven’t come across one previously, the list-matrix and the 3D array are essentially equivalent:

head(tri.A[,'v1','x'])
[1] 0.9825 0.7980 0.5111 0.0385 0.6194 0.1001
head(tri.L[['v1','x']])
[1] 0.9825 0.7980 0.5111 0.0385 0.6194 0.1001

An example operation might be to find the mean value of each coordinate for each vertex:

colMeans(tri.A)
      dim
vertex     x     y   z
    v1 0.499 0.501 0.5
    v2 0.500 0.500 0.5
    v3 0.500 0.500 0.5
apply(tri.L, 1:2, do.call, what=mean)
      dim
vertex     x     y   z
    v1 0.499 0.501 0.5
    v2 0.500 0.500 0.5
    v3 0.500 0.500 0.5

apply with the second argument (MARGIN) set to 1:2 is roughly equivalent to the following (apply additionally simplifies the list result to a numeric matrix when it can):

MARGIN <- 1:2
result <- vector('list', prod(dim(tri.L)[MARGIN]))
dim(result) <- dim(tri.L)[MARGIN]
fun <- do.call
for(i in seq_len(nrow(tri.L)))
  for(j in seq_len(ncol(tri.L)))
    result[i, j] <- fun(what=mean, tri.L[i, j])

We use do.call because tri.L[i, j] is a list, and we wish to call mean with the contents of the list, not the list itself. We could also have used a function like function(x) mean(x[[1]]) instead of do.call.
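For completeness, the anonymous function version reads as follows; it should produce the same matrix of means as above:

apply(tri.L, 1:2, function(x) mean(x[[1]]))  # same result as above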

We’ll be using list matrices in the next section.

A Case Study

One of my side projects required me to implement barycentric coordinate conversions. You can think of the barycentric coordinates of a point in a triangle as the weights you would place on the vertices to make the triangle balance on that point.

We show here three different points (red crosses) with the barycentric coordinates associated with each vertex. The size of the vertex helps visualize the “weights”:

This is useful if we want to interpolate values from the vertices to points in the triangle. Here we interpolate a color for a point in a triangle from the colors of the vertices of that triangle:

The formulas for converting Cartesian coordinates to barycentric are:

\[\lambda_1 = \frac{(y_2 - y_3)(x - x_3) + (x_3 - x_2)(y - y_3)}{(y_2 - y_3)(x_1 - x_3) + (x_3 - x_2)(y_1 - y_3)}\\ \lambda_2 = \frac{(y_3 - y_1)(x - x_3) + (x_1 - x_3)(y - y_3)}{(y_2 - y_3)(x_1 - x_3) + (x_3 - x_2)(y_1 - y_3)}\\ \lambda_3 = 1 - \lambda_1 - \lambda_2\]

You can see that whatever the data structure is, we will be using a lot of column subsetting for this calculation. We implement bary_A and bary_L to convert Cartesian coordinates to barycentric using array and list-matrix data structures respectively. Since both of these are direct translations of the above formulas, we relegate them to the code appendix. We will use them to shade every point of our triangle with the weighted color of the vertices.

First, the data (defined in the appendix):

str(point.L)  # every point we want to compute a color for
List of 2
 $ x: num [1:79946] 0 0.00251 0.00501 0.00752 0.01003 ...
 $ y: num [1:79946] 0.067 0.067 0.067 0.067 0.067 ...
vertex.L      # list-matrix!
    Data
V    x             y             r             g             b            
  v1 Numeric,79946 Numeric,79946 Numeric,79946 Numeric,79946 Numeric,79946
  v2 Numeric,79946 Numeric,79946 Numeric,79946 Numeric,79946 Numeric,79946
  v3 Numeric,79946 Numeric,79946 Numeric,79946 Numeric,79946 Numeric,79946

The vertex.L list-matrix has the x, y coordinates and the red, green, and blue color channel values of the triangle associated with each point. In this particular case it happens to be the same triangle for every point, so we could have used a single instance of the triangle. However, the generalization needs to allow for many triangles with a small and varying number of points per triangle, and to handle that efficiently in R we need to match each point to its own triangle.

And the shading function:

v_shade_L <- function(p, v) {
  ## 1) compute barycentric coords
  bc <- bary_L(p, v)
  ## 2) for each point, weight vertex colors by bary coords
  clr.raw <- apply(v[, c('r','g','b')], 2, Map, f="*", bc)
  ## 3) for each point-colorchannel, sum the weighted values
  lapply(clr.raw, Reduce, f="+")
}

Step 1) is carried out by bary_L, which is a simple adaptation of the barycentric conversion formulas. Step 2) is the most complicated. To understand what’s going on we need to look at the two data inputs to apply, which are:

v[, c('r','g','b')]
    Data
V    r             g             b            
  v1 Numeric,79946 Numeric,79946 Numeric,79946
  v2 Numeric,79946 Numeric,79946 Numeric,79946
  v3 Numeric,79946 Numeric,79946 Numeric,79946
str(bc)
List of 3
 $ : num [1:79946] 1 0.997 0.995 0.992 0.99 ...
 $ : num [1:79946] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:79946] 0 0.00251 0.00501 0.00752 0.01003 ...

In:

clr.raw <- apply(v[, c('r','g','b')], 2, Map, f="*", bc)

We ask apply to call Map for each column of v[, c('r','g','b')]. The additional arguments f='*' and bc are passed on to Map such that, for the first column, the Map call is:

Map('*', v[,'r'], bc)

v[,'r'] is a list with the red color channel values for each of the three vertices for each of the points in our data:

str(v[,'r'])
List of 3
 $ v1: num [1:79946] 1 1 1 1 1 1 1 1 1 1 ...
 $ v2: num [1:79946] 1 1 1 1 1 1 1 1 1 1 ...
 $ v3: num [1:79946] 0 0 0 0 0 0 0 0 0 0 ...

bc is a list of the barycentric coordinates of each point, which we will use as the vertex weights when averaging the vertex colors into point colors:

str(bc)
List of 3
 $ : num [1:79946] 1 0.997 0.995 0.992 0.99 ...
 $ : num [1:79946] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:79946] 0 0.00251 0.00501 0.00752 0.01003 ...

The Map call is just multiplying each color value by its vertex weight.

apply will repeat this over the three color channels giving us the barycentric coordinate-weighted color channel values for each point:

str(clr.raw)
List of 3
 $ r:List of 3
  ..$ v1: num [1:79946] 1 0.997 0.995 0.992 0.99 ...
  ..$ v2: num [1:79946] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ v3: num [1:79946] 0 0 0 0 0 0 0 0 0 0 ...
 $ g:List of 3
  ..$ v1: num [1:79946] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ v2: num [1:79946] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ v3: num [1:79946] 0 0.00251 0.00501 0.00752 0.01003 ...
 $ b:List of 3
  ..$ v1: num [1:79946] 1 0.997 0.995 0.992 0.99 ...
  ..$ v2: num [1:79946] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ v3: num [1:79946] 0 0.00251 0.00501 0.00752 0.01003 ...

Step 3) collapses the weighted values into (r,g,b) triplets for each point:

str(lapply(clr.raw, Reduce, f="+"))
List of 3
 $ r: num [1:79946] 1 0.997 0.995 0.992 0.99 ...
 $ g: num [1:79946] 0 0.00251 0.00501 0.00752 0.01003 ...
 $ b: num [1:79946] 1 1 1 1 1 1 1 1 1 1 ...

And with the colors we can render our fully shaded triangle (see appendix for plot_shaded):

color.L <- v_shade_L(point.L, vertex.L)
plot_shaded(point.L, do.call(rgb, color.L))

The corresponding function in array format looks very similar to the list one:

v_shade_A <- function(p, v) {
  ## 1) compute barycentric coords
  bc <- bary_A(p, v)
  ## 2) for each point, weight vertex colors by bary coords
  v.clrs <- v[,,c('r','g','b')]
  clr.raw <- array(bc, dim=dim(v.clrs)) * v.clrs
  ## 3) for each point-colorchannel, sum the weighted values
  rowSums(aperm(clr.raw, c(1,3,2)), dims=2)
}

For a detailed description of what it is doing see the appendix. The take-away for now is that every function in the array implementation is fast in the sense that all the real work is done in internal C code.

The data are also similar, although they are in matrix and array format instead of list and list-matrix:

str(point.A)
 num [1:79946, 1:2] 0 0.00251 0.00501 0.00752 0.01003 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "x" "y"
str(vertex.A)
 num [1:79946, 1:3, 1:5] 0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, "dimnames")=List of 3
  ..$     : NULL
  ..$ V   : chr [1:3] "v1" "v2" "v3"
  ..$ Data: chr [1:5] "x" "y" "r" "g" ...

Both shading implementations produce the same result:

color.A <- v_shade_A(point.A, vertex.A)
col_eq(color.L, color.A)
[1] TRUE

Yet, the list based shading is substantially faster:

microbenchmark(times=25,
  v_shade_L(point.L, vertex.L),
  v_shade_A(point.A, vertex.A)
)
Unit: milliseconds
                         expr  min   lq mean median   uq   max neval
 v_shade_L(point.L, vertex.L) 12.8 15.8 19.4   18.6 22.2  30.7    25
 v_shade_A(point.A, vertex.A) 58.2 68.2 76.5   71.9 80.1 156.2    25

We can get a better sense of where the speed differences are coming from by looking at the profile data using treeprof:

treeprof(v_shade_L(point.L, vertex.L))
Ticks: 877; Iterations: 46; Time Per: 22.30 milliseconds; Time Total: 1.026 seconds; Time Ticks: 0.877

                                  milliseconds
v_shade_L -------------------- : 22.30 -  0.00
    bary_L ------------------- : 11.77 - 11.77
    apply -------------------- :  6.69 -  0.00
    |   FUN ------------------ :  6.69 -  0.00
    |       mapply ----------- :  6.69 -  0.00
    |           <Anonymous> -- :  6.69 -  6.69
    lapply ------------------- :  3.84 -  0.00
        FUN ------------------ :  3.84 -  0.00
            f ---------------- :  3.84 -  3.84
treeprof(v_shade_A(point.A, vertex.A))
Ticks: 936; Iterations: 12; Time Per: 95.83 milliseconds; Time Total: 1.150 seconds; Time Ticks: 0.936

                                    milliseconds
v_shade_A ---------------------- : 95.83 -  0.00
    bary_A --------------------- : 60.92 - 60.10
    |   cbind ------------------ :  0.82 -  0.82
    rowSums -------------------- : 26.00 - 10.65
    |   is.data.frame ---------- : 15.36 -  0.00
    |       aperm -------------- : 15.36 -  0.10
    |           aperm.default -- : 15.25 - 15.25
    array ---------------------- :  8.91 -  8.91

Most of the time difference is in the bary_L vs bary_A functions that compute the barycentric coordinates. Because almost all the time in the bary_* functions is “self” time (the second column in the profile data), you can also tell that this time is spent almost entirely on primitive / internal operations.

Step 2) takes about the same amount of time for both the list-matrix and array methods (the apply call for v_shade_L and the array call for v_shade_A), but step 3) is also much faster for the list approach because with the array approach we need to permute the dimensions before we can use rowSums (more details in the appendix).

Conclusion

If your data set is tall and skinny, list data structures can improve performance without compiled code or additional external dependencies. Most operations are comparable in speed whether done with matrices or lists, but some key operations are faster with lists, and that makes certain tasks noticeably more efficient.

One trade-off is that data manipulation can be more difficult with lists, though for code in a package the extra work may be justified. And the difficulty does not always cut against lists: patterns like do.call(fun, list) simplify tasks that can be awkward with matrices.

Questions? Comments? @ me on Twitter.

Appendix

Session Info

sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] treeprof_0.2.3       microbenchmark_1.4-4

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0        bookdown_0.7      later_0.7.3      
 [4] digest_0.6.18     rprojroot_1.3-2   mime_0.5         
 [7] R6_2.3.0          xtable_1.8-2      backports_1.1.2  
[10] magrittr_1.5      evaluate_0.10.1   blogdown_0.8     
[13] stringi_1.2.3     promises_1.0.1    data.table_1.11.4
[16] rmarkdown_1.10    tools_3.5.2       stringr_1.3.1    
[19] shiny_1.1.0       httpuv_1.4.4.1    xfun_0.3         
[22] yaml_2.2.0        compiler_3.5.2    htmltools_0.3.6  
[25] knitr_1.20       

Column Equality

col_eq <- function(x, y) {
  if(is.matrix(x)) x <- lapply(seq_len(ncol(x)), function(z) x[, z])
  if(is.matrix(y)) y <- lapply(seq_len(ncol(y)), function(z) y[, z])
  # stopifnot(is.list(x), is.list(y))
  all.equal(x, y, check.attributes=FALSE)
}

Barycentric Conversions

## Conversion to Barycentric Coordinates; List-Matrix Version

bary_L <- function(p, v) {
  det <-  (v[[2,'y']]-v[[3,'y']])*(v[[1,'x']]-v[[3,'x']]) +
          (v[[3,'x']]-v[[2,'x']])*(v[[1,'y']]-v[[3,'y']])

  l1 <- (
          (v[[2,'y']]-v[[3,'y']])*(  p[['x']]-v[[3,'x']]) +
          (v[[3,'x']]-v[[2,'x']])*(  p[['y']]-v[[3,'y']])
        ) / det
  l2 <- (
          (v[[3,'y']]-v[[1,'y']])*(  p[['x']]-v[[3,'x']]) +
          (v[[1,'x']]-v[[3,'x']])*(  p[['y']]-v[[3,'y']])
        ) / det
  l3 <- 1 - l1 - l2
  list(l1, l2, l3)
}

## Conversion to Barycentric Coordinates; Array Version

bary_A <- function(p, v) {
  det <- (v[,2,'y']-v[,3,'y'])*(v[,1,'x']-v[,3,'x']) +
         (v[,3,'x']-v[,2,'x'])*(v[,1,'y']-v[,3,'y'])

  l1 <- (
          (v[,2,'y']-v[,3,'y']) * (p[,'x']-v[,3,'x']) +
          (v[,3,'x']-v[,2,'x']) * (p[,'y']-v[,3,'y'])
        ) / det
  l2 <- (
          (v[,3,'y']-v[,1,'y']) * (p[,'x']-v[,3,'x']) +
          (v[,1,'x']-v[,3,'x']) * (p[,'y']-v[,3,'y'])
        ) / det
  l3 <- 1 - l1 - l2
  cbind(l1, l2, l3)
}

In reality we could narrow some of the performance difference between these two implementations by avoiding repeated subsets of the same data. For example, v[,3,'y'] is used several times, and we could have saved that subset to another variable for re-use.
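For example, here is a sketch of what that could look like for the array version (bary_A2 is just an illustrative name; the same caching idea applies to bary_L):

bary_A2 <- function(p, v) {
  ## cache the repeatedly subset columns so each copy only happens once
  v1x <- v[,1,'x']; v1y <- v[,1,'y']
  v2x <- v[,2,'x']; v2y <- v[,2,'y']
  v3x <- v[,3,'x']; v3y <- v[,3,'y']
  px  <- p[,'x'] - v3x
  py  <- p[,'y'] - v3y

  det <- (v2y - v3y) * (v1x - v3x) + (v3x - v2x) * (v1y - v3y)
  l1  <- ((v2y - v3y) * px + (v3x - v2x) * py) / det
  l2  <- ((v3y - v1y) * px + (v1x - v3x) * py) / det
  l3  <- 1 - l1 - l2
  cbind(l1, l2, l3)
}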

Plotting

plot_shaded <- function(p, col, v=vinit) {
  par(bg='#EEEEEE')
  par(xpd=TRUE)
  par(mai=c(.25,.25,.25,.25))
  plot.new()

  points(p, pch=15, col=col, cex=.2)
  polygon(v[,c('x', 'y')])
  points(
    vinit[,c('x', 'y')], pch=21, cex=5,
    bg=rgb(apply(v[, c('r','g','b')], 2, unlist)), col='white'
  )
}

Array Shading

v_shade_A <- function(p, v) {
  ## 1) compute barycentric coords
  bc <- bary_A(p, v)
  ## 2) for each point, weight vertex colors by bary coords
  v.clrs <- v[,,c('r','g','b')]
  clr.raw <- array(bc, dim=dim(v.clrs)) * v.clrs
  ## 3) for each point-colorchannel, sum the weighted values
  rowSums(aperm(clr.raw, c(1,3,2)), dims=2)
}

There are two items that may require additional explanation in v_shade_A. First, as part of step 2) we use:

array(bc, dim=dim(v.clrs)) * v.clrs

Our problem is that we need to multiply the vertex weights in bc with the v.clrs color channel values, but they do not conform:

dim(bc)
[1] 79946     3
dim(v.clrs)
[1] 79946     3     3

Since we use the same weights for each color channel, it is just a matter of recycling the bc matrix once for each channel. And since the color channel is the last dimension, we can just let array recycle the data to fill out the full dimensions of the vertex color data:

dim(array(bc, dim=dim(v.clrs)))
[1] 79946     3     3

This is reasonably fast, but not completely ideal since it does mean we need to allocate a memory chunk the size of v.clrs. Ideally R would internally recycle the bc matrix the way it recycles vectors in situations like matrix(1:4, 2) * 1:2, but that is not how it works.

The other tricky bit is part of step 3):

rowSums(aperm(clr.raw, c(1,3,2)), dims=2)

We ask rowSums to preserve the first two dimensions of the array with dims=2. The other dimensions are collapsed by summing the values that overlap in the first two. Unfortunately, we can only specify adjacent dimensions to preserve with rowSums. As a result we permute the data with aperm so that the point and color channel dimensions can be the first two and thus be preserved.
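A tiny standalone example may make the dims argument clearer; the small array and its values are made up purely for illustration:

a <- array(1:8, c(2, 2, 2))
rowSums(a, dims=2)   # keeps dimensions 1:2, sums across the third
     [,1] [,2]
[1,]    6   10
[2,]    8   12
col_eq(rowSums(a, dims=2), a[,,1] + a[,,2])
[1] TRUE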

There may be clever ways to better structure the data to minimize the number of copies we make.

Data

## Define basic triangle vertex position and colors

height <- sin(pi/3)
v.off <- (1 - height) / 2
vinit <- cbind(
  x=list(0, .5, 1), y=as.list(c(0, height, 0) + v.off),
  r=list(1, 1, 0), g=list(0,1,1), b=list(1,0,1)
)
## Sample points within the triangles

p1 <- list(x=.5, y=tan(pi/6) * .5 + v.off)
p2 <- list(x=.5, y=sin(pi/3) * .8 + v.off)
p3 <- list(x=cos(pi/6)*sin(pi/3), y=sin(pi/3) * .5 + v.off)

## Generate n x n sample points within triangle; here we also expand the
## vertex matrix so there is one row per point, even though all points are
## inside the same triangle.  In our actual use case, there are many
## triangles with a handful of points within each triangle.

n <- 400
rng.x <- range(unlist(vinit[,'x']))
rng.y <- range(unlist(vinit[,'y']))
points.raw <- expand.grid(
  x=seq(rng.x[1], rng.x[2], length.out=n),
  y=seq(rng.y[1], rng.y[2], length.out=n),
  KEEP.OUT.ATTRS=FALSE
)
vp <- lapply(vinit, '[', rep(1, nrow(points.raw)))
dim(vp) <- dim(vinit)
dimnames(vp) <- dimnames(vinit)

## we're going to drop the oob points for the sake of clarity, one
## nice perk of barycentric coordinates is that negative values indicate
## you are out of the triangle.

bc.raw <- bary_L(points.raw, vp)
inbounds <- Reduce('&', lapply(bc.raw, '>=', 0))

## Make a list-matrix version of the data

point.L <- lapply(points.raw, '[', inbounds)
vertex.L <- lapply(vp, '[', inbounds)
dim(vertex.L) <- dim(vp)
dimnames(vertex.L) <- list(V=sprintf("v%d", 1:3), Data=colnames(vp))

## Generate an array version of the same data

point.A <- do.call(cbind, point.L)
vertex.A <- array(
  unlist(vertex.L), c(sum(inbounds), nrow(vertex.L), ncol(vertex.L)),
  dimnames=c(list(NULL), dimnames(vertex.L))
)
Written by Brodie Gaslam, a hobbyist programmer based on the US East Coast.