API

Contents

Public Interface

Graph creation

ComputationGraphs.ComputationGraphType

Structure used to store the computation graph.

Fields:

  • nodes::Vector{AbstractNode}: vector with all nodes in the graph in topological order: children always appear after parents.

  • children::Vector{Vector{Int}}: vector with all the children of each node

  • parents::Vector{Vector{Int}}: vector with all the parents of each node

  • validValue::BitVector: boolean vector, true indicates that node contains a valid value

  • compute_with_ancestors::Vector{FunctionWrapper}: vector of functions to compute each node and all its required ancestors

  • count::Vector{UInt}: number of times each node has been computed since last resetLog!

source
ComputationGraphs.@addMacro
@add graph expression

Macro to add a complex expression into a computation graph.

This macro "breaks" the complex expression into elementary subexpressions and adds them all to the graph.

Parameters:

  • graph::ComputationGraph: graph where expression will be stored
  • expression::Expr: expression to be added to the graph

Returns:

  • Node::AbstractNode: graph node for the final expression

Example:

The following code provides two alternatives to create a computation graph that evaluates err = ||A*x - b||^2:

  1. without the @add macro

     using ComputationGraphs
     gr = ComputationGraph(Float32)
     A = variable(gr, 3, 4)
     x = variable(gr, 4)
     b = variable(gr, 3)
     Ax = *(gr, A, x)
     Axb = -(gr, Ax, b)
     err = norm2(gr, Axb)
     display(gr)

  2. with the @add macro

     using ComputationGraphs
     gr = ComputationGraph(Float32)
     A = @add gr variable(3, 4)
     x = @add gr variable(4)
     b = @add gr variable(3)
     err = @add gr norm2(times(A, x) - b)
     display(gr)
source
ComputationGraphs.variableFunction
variable(graph,dims)
variable(graph,dims...)
variable(graph, value)

Creates a variable of the given dimension

Parameters:

  • graph::ComputationGraph: Computation graph where variable will be stored.
  • dims::NTuple{N,Int}: Desired dimension of the variable. An empty tuple () results in a scalar variable.
  • value::AbstractArray: Initial value for the variable, which implicitly defines its dimension. If value is a scalar, it is first converted to a 0-dimensional array using fill(value).

Returns:

  • node of the computation graph associated with the variable created
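
A short sketch of the three call forms (sizes and values below are arbitrary):

using ComputationGraphs
gr = ComputationGraph(Float32)
x = variable(gr, 4)                   # 4-vector, dimensions given as separate arguments
A = variable(gr, (3, 4))              # 3x4 matrix, dimensions given as a tuple
s = variable(gr, ())                  # scalar (0-dimensional) variable
b = variable(gr, Float32[1, 2, 3])    # dimension and initial value taken from the array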
source
ComputationGraphs.constantFunction
constant(graph, value)

Creates a (constant) array equal to the given value.

Parameters:

  • graph::ComputationGraph: Computation graph where the array will be stored.
  • value::AbstractArray: Desired value for the array. If value is a scalar, it is first converted to a 0-dimensional array using fill(value).

Returns:

  • node of the computation graph associated with an array created
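
For illustration (arbitrary values):

using ComputationGraphs
gr = ComputationGraph(Float64)
c = constant(gr, [1.0, 2.0, 3.0])   # constant 3-vector
s = constant(gr, 2.0)               # scalar, stored as a 0-dimensional array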
source
Base.zerosFunction
zeros(graph, dims)
zeros(graph, dims...)

Creates an array filled with 0's

Parameters:

  • graph::ComputationGraph: computation graph where the array will be stored
  • dims::NTuple{N,Int}: dimension of the array

Returns:

  • node of the computation graph associated with an array filled with zero(TypeValue)
source
Base.onesFunction
ones(graph, dims)
ones(graph, dims...)

Creates an array filled with 1's

Parameters:

  • graph::ComputationGraph: computation graph where array will be stored
  • dims::NTuple{N,Int}: dimension of the array

Returns:

  • node of the computation graph associated with an array filled with one(TypeValue)
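
Both zeros and ones accept the dimensions as a tuple or as separate arguments; for illustration:

using ComputationGraphs
gr = ComputationGraph(Float32)
z = zeros(gr, 3, 4)    # 3x4 array filled with zero(Float32)
o = ones(gr, (3,))     # 3-vector filled with one(Float32)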
source
Base.sizeFunction
size(node)
size(graph, node)

Returns a tuple with the size of the array associated with a node of a computation graph.

source
size(node, dim)
size(graph, node, dim)

Returns the size of the array associated with a node of a computation graph, along dimension dim.

source
Base.lengthFunction
length(node)
length(graph, node)

Returns the number of entries of the array associated with a node of a computation graph.

source

Number of nodes in the graph.

source
Base.similarFunction
similar(node)
similar(graph, node)

Creates an uninitialized array with the same type and size as the graph node.

source
Base.Multimedia.displayFunction
display(node)
display(nodes)

Display one node of a computation graph or a tuple of nodes

source
display(graph;topTimes=false)

Display the nodes of a computation graph.

When topTimes=true only displays the nodes with the largest total computation times (and hides information about parents/children).

source
display(graph,node;withParents=true)

When withParents=true, shows the full expression needed to compute a specific node; otherwise only shows the node itself (as in display(node)).

source

Operations supported

ComputationGraphs.expandColumnsFunction
expandColumns(a,rows,nRows)

Expands a vector a into a matrix A as follows: given an n-vector a, returns an nRows x n matrix A with A[i,j] = a[j] if i == rows[j], and 0 otherwise.

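The following plain-Julia snippet illustrates the semantics only (it does not use the graph API):

# illustration of the expandColumns semantics in plain Julia
a = [10.0, 20.0, 30.0]
rows = [2, 1, 3]
nRows = 3
A = zeros(nRows, length(a))
for j in eachindex(a)
    A[rows[j], j] = a[j]      # A[i,j] = a[j] when i == rows[j], 0 otherwise
end
# A == [0.0 20.0 0.0; 10.0 0.0 0.0; 0.0 0.0 30.0]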
source
ComputationGraphs.findMaxRowFunction
y=findMaxRow(A)

Creates an integer-valued vector y with as many entries as columns of A, where y[j] is equal to the row index of the largest entry in column j of A.

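In plain Julia this corresponds to (illustration of the semantics only, not the graph API):

A = [1.0 5.0; 3.0 2.0]
y = [argmax(A[:, j]) for j in axes(A, 2)]   # y == [2, 1]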
source
ComputationGraphs.maxRowFunction

maxRow(A) computes a vector y with as many entries as columns of A, where y[j] is equal to the largest entry in column j of A.

source
Base.:-Function

-() unary minus operator for a vector or matrix

source

a - b subtraction operator

source
Base.:*Function

a * b maps to times() or scalarTimes() depending on the sizes of the arguments

source
ComputationGraphs.ddlogisticFunction
ddlogistic(x)=(exp(-2x)-exp(-x))/(1+exp(-x))^3

computes the second derivative of the logistic function of all entries of a vector or matrix

source
ComputationGraphs.dlogisticFunction
dlogistic(x)=exp(-x)/(1+exp(-x))^2

computes the derivative of the logistic function of all entries of a vector or matrix

source
Base.expFunction

exp() computes the exponential of all entries of a vector or matrix

source
Base.signFunction

sign() computes the sign function of all entries of a vector or matrix

source
Base.sqrtFunction

sqrt() takes the square root of all entries of a vector or matrix

source

Differentiation

ComputationGraphs.DFunction
Y = D(graph, F, P)
Y = D(graph, V, F, P)

Computes the partial derivative of the expression encoded in the node F with respect to the variable encoded in the node P, along the direction V. Formally, Y is a scalar/vector/matrix with the same size as the variable P, with its jth entry equal to

$Y[j] = \sum_i V[i] \nabla_{P[j]} F[i]$

where $\nabla_{P[j]} F[i]$ denotes the partial derivative of the ith entry of F with respect to the jth entry of P.

The direction V can be omitted when F is a scalar, in which case

$Y[j] = \nabla_{P[j]} F$

Parameters

  • graph::ComputationGraph: Computation graph encoding the relevant expressions and variables.
  • V::Node: Direction along which the partial derivative is computed. This node needs to have the same size as F.
  • F::Node: Expression to be differentiated.
  • P::NodeVariable: Variable with respect to which F will be differentiated. This node must have been created using variable.

Returns

  • Y::Node: Node that encodes the expression of the partial derivative (added to the graph if it was not already part of it.) This node will have the same size as P.
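
For example, the gradient of the scalar-valued loss ||A*x - b||^2 (from the @add example above) with respect to x can be built without specifying a direction; a minimal sketch using only calls documented on this page:

using ComputationGraphs
gr = ComputationGraph(Float64)
A = variable(gr, 3, 4)
x = variable(gr, 4)
b = variable(gr, 3)
err = @add gr norm2(times(A, x) - b)   # scalar-valued expression
grad = D(gr, err, x)                   # gradient node, same size as x
set!(gr, A, randn(3, 4))
set!(gr, x, randn(4))
set!(gr, b, randn(3))
println("gradient: ", get(gr, grad))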
source
ComputationGraphs.hessianFunction
Y = hessian(graph, F, P, Q)

Computes the Hessian matrix of the expression encoded in the (scalar-valued) node F with respect to the variables encoded in the (vector-valued) nodes P and Q. Formally, Y is a matrix with its (i,j)th entry equal to

$Y[i,j] = \nabla_{P[i]} \nabla_{Q[j]} F$

where $\nabla_{X}$ denotes partial derivative with respect to X.

Parameters

  • graph::ComputationGraph: Computation graph encoding the relevant expressions and variables.
  • F::Node: Expression to be differentiated.
  • P::NodeVariable: First variable with respect to which F will be differentiated. This node must have been created using variable.
  • Q::NodeVariable: Second variable with respect to which F will be differentiated. This node must have been created using variable.

Returns

  • Y::Node: Node that encodes the expression of the Hessian matrix (added to the graph if it was not already part of it.)
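
Continuing the D sketch above, the Hessian of the same scalar loss with respect to x is (minimal sketch):

H = hessian(gr, err, x, x)   # 4x4 node with H[i,j] equal to the (i,j) second derivative
println("hessian: ", get(gr, H))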
source

Graph computations

ComputationGraphs.set!Function
set!(graph,node,value)
set!(graph,nodes,values)

Updates a variable node:

  • sets the value of the variable node
  • marks all its children as having invalid values
source
Base.getFunction

Get the value of a node, performing whatever computations are needed.

source

Get the values of a list of nodes.

source
Base.copyto!Function
  • performs whatever computations are needed for the source node to hold a valid value
  • copies the value of the source node to the destination node
  • marks all children of the destination node as having invalid values
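
As in the gradDescent! example below, the destination node(s) are passed before the source node(s); a minimal sketch of the usual update pattern:

compute!(graph, next_theta)         # make sure the source nodes hold valid values
copyto!(graph, theta, next_theta)   # copy into the destination variables and invalidate their children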
source

Recipes

ComputationGraphs.gradDescent!Function
(;next_theta,eta,gradients) = gradDescent!(graph; loss, theta)

Recipe used to perform the computations needed by the classical gradient descent algorithm to minimize a (scalar-valued) loss function $J(\theta)$ by adjusting a set of optimization parameters $\theta$, according to

\[ \theta^+ = \theta - \eta\, \nabla_\theta J(\theta)\]

Parameters:

  • graph::ComputationGraph: Computation graph that is updated "in-place" by adding to it all the nodes needed to perform one step of gradient descent.

  • loss::Node: Scalar-valued computation node that corresponds to the loss function $J(\theta)$

  • theta::NamedTuple: Named tuple with the variable nodes that correspond to the optimization parameters $\theta$.

Returns: named tuple with

  • eta::Node: Scalar-valued variable node that can be used to set the learning rate $\eta$.

  • next_theta::Tuple: Named tuple with the computation nodes that hold the value $\theta^+$ of the optimization parameters after one gradient descent iteration.

  • gradients::NamedTuple: Named tuple of the computation nodes that hold the value of the gradients of the loss function with respect to the different variables in theta.

Example:

using ComputationGraphs

graph = ComputationGraph(Float64)

# Define optimization parameters and loss function
A = variable(graph, 4, 3)
x = variable(graph, 3)
b = variable(graph, 4)
loss = @add graph norm2(times(A, x) - b)

# Call gradDescent! recipe
theta = (;x,)
(; next_theta, eta, gradients) = gradDescent!(graph; loss, theta)

# Set fixed parameters
set!(graph, A, [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0; 10.0 11.0 12.0])
set!(graph, b, [2.0, 2.0, 2.0, 2.0])

# Set learning rate
set!(graph, eta, 0.001)

# Initialize optimization parameter
set!(graph, x, [1.0, 1.0, 1.0])
println("initial loss: ", get(graph,loss))

# Gradient descent loop
for i in 1:100
    compute!(graph, next_theta)       # compute next value of theta
    copyto!(graph, theta, next_theta) # execute update
end
println("final loss: ", get(graph,loss))
source
ComputationGraphs.adam!Function
(;  eta, beta1, beta2, epsilon,
    init_state, state, next_state,
    next_theta, gradients) = adam!(graph; loss, theta)

Recipe used to perform the computations needed by the Adam method to minimize a (scalar-valued) loss function $J(\theta)$ by adjusting a set of optimization parameters $\theta$.

The algorithm is described in the Adam paper, using the comment just before its Section 2.1 for a more efficient implementation.

Parameters:

  • graph::ComputationGraph: Computation graph that is updated "in-place" by adding to it all the nodes needed to perform one step of the Adam method.
  • loss::Node: Scalar-valued computation node that corresponds to the loss function $J(\theta)$
  • theta::NamedTuple: Named tuple with the variable nodes that correspond to the optimization parameters $\theta$.

Returns: named tuple with the following nodes/tuples of nodes

  • eta: Scalar-valued variable node used to set the learning rate $\eta$.

  • beta1: Scalar-valued variable node used to set Adam's beta1 parameter.

  • beta2: Scalar-valued variable node used to set Adam's beta2 parameter.

  • epsilon: Scalar-valued variable node used to set Adam's epsilon parameter.

  • init_state, state, next_state: Adam's internal state initializer, current value, and next value, which include the iteration number and the 2 moments

  • next_theta::Tuple: value $\theta^+$ of the optimization parameters after one Adam iteration

  • gradients: gradients of the loss function with respect to the different variables in theta.

Example:

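A hypothetical usage sketch, mirroring the gradDescent! example above; the way the state nodes are chained here (init_state copied into state once, then next_state copied into state at every iteration) is an assumption, not a documented contract:

using ComputationGraphs

graph = ComputationGraph(Float64)

# Same least-squares loss as in the gradDescent! example
A = variable(graph, 4, 3)
x = variable(graph, 3)
b = variable(graph, 4)
loss = @add graph norm2(times(A, x) - b)

# Call adam! recipe
theta = (;x,)
(; eta, beta1, beta2, epsilon,
   init_state, state, next_state,
   next_theta, gradients) = adam!(graph; loss, theta)

# Set fixed parameters and Adam hyper-parameters (typical Adam defaults)
set!(graph, A, [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0; 10.0 11.0 12.0])
set!(graph, b, [2.0, 2.0, 2.0, 2.0])
set!(graph, eta, 0.001)
set!(graph, beta1, 0.9)
set!(graph, beta2, 0.999)
set!(graph, epsilon, 1e-8)

# Initialize optimization parameter and Adam state (assumed initialization pattern)
set!(graph, x, [1.0, 1.0, 1.0])
copyto!(graph, state, init_state)
println("initial loss: ", get(graph, loss))

# Adam loop (assumed update pattern)
for i in 1:100
    compute!(graph, next_theta)
    compute!(graph, next_state)
    copyto!(graph, theta, next_theta)
    copyto!(graph, state, next_state)
end
println("final loss: ", get(graph, loss))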
source
ComputationGraphs.denseChain!Function
(; inference, training, theta) = denseChain!(graph; 
    nNodes, inferenceBatchSize, trainingBatchSize, activation,loss)

Recipe used to construct a graph for inference and training of a dense forward neural network.

    x[1]   = input
    x[2]   = activation(W[1] * x[1] + b[1])
    ...
    x[N-1] = activation(W[N-2] * x[N-2] + b[N-2])
    output = W[N-1] * x[N-1] + b[N-1]              # no activation in the last layer

with a loss function of

  loss = lossFunction(output-reference)

Parameters:

  • graph::ComputationGraph: Computation graph that is updated "in-place" by adding to it all the nodes needed for inference and training.

  • nNodes::Vector{Int}: Vector with the number of nodes in each layer, starting from the input and ending at the output layer.

  • inferenceBatchSize::Int=1: Number of inputs for each inference batch. When inferenceBatchSize=0 no nodes will be created for inference.

  • trainingBatchSize::Int=0: Number of inputs for each training batch. When trainingBatchSize=0 no nodes will be created for training.

  • activation::Function=ComputationGraphs.relu: Activation function. Use the identity function if no activation is desired.

  • loss::Symbol=:mse: Desired type of loss function, among the options:

      + :sse = sum of squared errors
      + :mse = mean-square error (i.e., sse normalized by the error size)
      + :huber = Huber function on the error
      + :mhuber = Huber function on the error, normalized by the error size

Returns: named tuple with the following fields

  • inference::NamedTuple: named tuple with the inference nodes:

      + input: NN input for inference
      + output: NN output for inference

    When inferenceBatchSize=0 this tuple is returned empty.

  • training::NamedTuple: named tuple with the training nodes:

      + input: NN input for training
      + output: NN output for training
      + reference: NN desired output for training
      + loss: NN loss for training

    When trainingBatchSize=0 this tuple is returned empty.

  • theta::NamedTuple: named tuple with the NN parameters (all the matrices W and b)

Example:

using ComputationGraphs, Random
graph=ComputationGraph(Float32)
(; inference, training, theta)=denseChain!(graph; 
        nNodes=[1,20,20,20,2], inferenceBatchSize=1, trainingBatchSize=3,
        activation=ComputationGraphs.relu, loss=:mse)

# (repeatable) random initialization of the weights
Random.seed!(0)
for k in eachindex(theta)
    set!(graph,theta[k],randn(Float32,size(theta[k])))
end

# Compute output for a random input
input=randn(Float32,size(inference.input))
set!(graph,inference.input,input)
output=get(graph,inference.output)
println("input = ",input,", output = ",output)

# compute loss for a batch of random inputs and desired outputs (reference)
input=randn(Float32,size(training.input))
reference=randn(Float32,size(training.reference))
set!(graph,training.input,input)
set!(graph,training.reference,reference)
loss=get(graph,training.loss)
println("inputs = ",input,", loss = ",loss)
source
ComputationGraphs.denseChainFunction
denseChain(TypeValue;
    nNodes=[],
    W=TypeArray{TypeValue,2}[],
    b=TypeArray{TypeValue,1}[],
    trainingBatchSize,
    inferenceBatchSize,
    activation=ComputationGraphs.relu,
    loss::Symbol=:sse,
    optimizer=NoOptimizer(),
    includeGradients=false,
    codeName="",
    parallel=false
)

Create computation graph for a dense forward neural network, defined as follows:

```
x[1]   = input

z[k]   = W[k] * x[k] + b[k]    for k in 1,...,K

x[k+1] = activation(z[k])            for k in 1,...,K-1

output = z[K]

loss = norm2(output-desiredOutput)

g[loss,W[k]] = gradient(loss,W[k]) for k in 1,...,K

g[loss,b[k]] = gradient(loss,b[k]) for k in 1,...,K
```

Parameters:

  • ::Type{TypeValue}: default type for the values of the computation graph nodes
  • nNodes::Vector{Int}=Int[]: vector with the number of nodes in each layer, starting from the input and ending at the output layer.
  • W::Vector{TypeArray{TypeValue,2}}=TypeArray{TypeValue,2}[]:
  • b::Vector{TypeArray{TypeValue,1}}=TypeArray{TypeValue,1}[]:
  • trainingBatchSize::Int:
  • inferenceBatchSize::Int:
  • activation::Function=ComputationGraphs.relu:
  • loss::Symbol=:sse:
  • optimizer::Op=NoOptimizer():
  • includeGradients::Bool=false:
  • codeName::String="":
  • parallel::Bool=false:

Returns: Named tuple with fields

  • graph
  • ioNodes
  • parameterNodes
  • trainingNodes
  • optimizerNodes
  • code
  • nOpsI2O

Number of forward operations to compute output:

  • z[k]:

    • # prods = sum(size(W[k],2)*(size(W[k],1)) for k in 1:K)
    • # sums = sum(size(W[k],2)*(size(W[k],1)) for k in 1:K)
  • x[k+1]:

    • # activation = sum(size(W[k],2) for k in 1:K-1)
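
A minimal call sketch based on the signature above (parameter values are arbitrary, and the contents of the returned tuples are not spelled out here):

using ComputationGraphs
(; graph, ioNodes, parameterNodes, trainingNodes, optimizerNodes, code, nOpsI2O) =
    denseChain(Float32;
        nNodes=[1, 20, 20, 2],
        inferenceBatchSize=1,
        trainingBatchSize=3,
        activation=ComputationGraphs.relu,
        loss=:sse)
display(graph)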
source

Parallelization

ComputationGraphs.requestFunction
request(graph, node::Node)
request(graph, node::NTuple{Node})
request(graph, node::NamedTuple{Node})

Requests parallel evaluation of a node or a tuple of nodes.

Presumes a previous call to computeSpawn!(graph)

source
Base.waitFunction
wait(graph, node::Node)
wait(graph, node::NTuple{Node})
wait(graph, node::NamedTuple{Node})

Waits for the evaluation of a node or a tuple of nodes, after an appropriate computation request made using request(graph, node(s))

Presumes a previous call to computeSpawn!(graph)

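A sketch of the request/wait pattern; here loss stands for any node previously added to the graph, and computeSpawn! is assumed to have been called once beforehand (as noted above):

computeSpawn!(graph)      # spawn the tasks that will compute nodes in parallel
request(graph, loss)      # ask for loss (and its ancestors) to be evaluated in the background
# ... other work can happen here ...
wait(graph, loss)         # block until the value of loss is ready
value = get(graph, loss)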
source

Code generation

ComputationGraphs.CodeType

Structure used to generate dedicated code:

Fields

  • parallel::Bool=false:

      1) When `true` the `valid` flags are implemented with `Threads.Event`s,
          otherwise just a `Bool` 
      
      2) When `true` each node has a `Threads.Task` that computes the node
         as needed.
  • unrolled::Bool=false:

      1) When `true`, the code generated for `get` uses a single function with nested `if`    
         statements to compute nodes on demand. 
    
         This can lead to very large functions for big graphs. 
      
         Parallel computation is not supported in this mode.
    
      2) When `false`, each node has its own `compute` function that (recursively) calls the    
         parents' `compute` functions.
  • count::Bool=true: When true, the generated code includes counters for how many times each node's computation function has been called.

source

Internal functions

Graph definition

ComputationGraphs.@newnodeMacro
@newnode name{C1,...,C2}::outputShape
@newnode name{Nparameters,C1,...,C2}::outputShape

Macro used to create a new computation node type, where

  • C1,...,C2 represent the operands

  • Nparameters (optional) represents the number of parameters, which are fixed (as opposed to the operands)

  • outputShape is the size of the result and

    • can be a constant Tuple, as in

      @newnode norm2{x}::()

    • can use C1,...,C2 (especially their sizes), e.g.,

      @newnode mult!{A,B}::(size(C1,1),size(C2,2))

    • can use the values of the parameters, denoted by par1, par2, ...; as in

      @newnode timesAdjointOnes{1,x}::(size(x, 1), par1)

This macro then generates

""" 
Node of a computation graph used to represent the result of name()
"""
struct NodeName{TP<:Tuple,TPI<:Tuple,TV<:AbstractArray,TC} <: ComputationGraphs.AbstractNode
    id::Int
    parameters::TP
    parentIds::TPI
    value::TV
    compute!::TC
end

export name
name(graph::ComputationGraph,C1::T1,C2::T2,par1,par2,
) where {T1<:AbstractNode,T2<:AbstractNode} =
    push!(graph,NodeName,cg_name!,(par1,par2),(C1.id,C2.id),(C1.value,C2.value),outputShape)
source

Graph evaluation

ComputationGraphs.generateComputeFunctionsFunction

Generates, for each node, a function that conditionally evaluates that node, using closures and enforcing type stability.

Each function will

  1. check whether each parent needs to be re-evaluated and, if so, re-evaluate the parent and set its valid bit to true;
  2. always recompute the node itself
    • without checking whether this is needed (the caller is expected to check, which enables force=true)
    • without setting the valid bit, which is expected to be set by the calling function.
source
ComputationGraphs.compute_node!Function
compute_node!(node)
compute_node!(graph,node)
compute_node!(graph,id)

Call the function generated by generateComputeFunctions that computes a single node.

source
ComputationGraphs.compute_with_ancestors!Function
compute_with_ancestors!(node)
compute_with_ancestors!(graph,node)
compute_with_ancestors!(graph,id)

Call the function generated by generateComputeFunctions that computes a node and all of its required ancestors.

source

Code generation

API index