Optimize graph#
Change topology (model isomorphism)#
These optimizations change the topology of the computation graph without changing its mathematical properties: for the same input, the model produces the same result before and after optimization. This is what we call model isomorphism.
Fuse MatMul & Add#
An ONNX graph can export a Dense / FC operator as two operators, MatMul and Add. This recipe replaces these two operators with a single FC operator, reusing the Producers attached to the MatMul and Add operators.
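The fusion described above can be illustrated on a toy graph. This is a minimal sketch, not the library's implementation: the `Node` class and `fuse_matmul_add` function are hypothetical names introduced here, and the graph is assumed to be a plain list of nodes in topological order. A MatMul is only fused when the Add is its sole consumer, and the fused FC node keeps the Add's name so that downstream nodes still resolve their inputs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical graph node: a name, an operator type, and input node names."""
    name: str
    op: str
    inputs: list = field(default_factory=list)


def fuse_matmul_add(nodes):
    """Replace every MatMul -> Add pair with one FC node (illustrative only)."""
    by_name = {n.name: n for n in nodes}
    consumers = {}
    for n in nodes:
        for src in n.inputs:
            consumers.setdefault(src, []).append(n.name)

    fused = {}       # Add name -> replacement FC node
    removed = set()  # names of MatMul nodes absorbed into an FC
    for add in nodes:
        if add.op != "Add":
            continue
        for src in add.inputs:
            mm = by_name.get(src)
            # Fuse only if the MatMul feeds this Add and nothing else.
            if mm is None or mm.op != "MatMul" or consumers[src] != [add.name]:
                continue
            bias = [s for s in add.inputs if s != src]
            # FC inputs: data and weight (from MatMul), then bias (from Add).
            fused[add.name] = Node(add.name, "FC", mm.inputs + bias)
            removed.add(mm.name)
            break
    return [fused.get(n.name, n) for n in nodes if n.name not in removed]


graph = [
    Node("x", "Input"),
    Node("w", "Producer"),
    Node("b", "Producer"),
    Node("mm", "MatMul", ["x", "w"]),
    Node("add", "Add", ["mm", "b"]),
    Node("relu", "ReLU", ["add"]),
]
optimized = fuse_matmul_add(graph)
```

In this sketch the weight and bias Producers are left untouched and simply re-attached to the FC node, which mirrors the reuse of Producers mentioned above.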
Tiling#
Proposed implementation#
Graph transformation:
Scheduling:
sequenceDiagram
    autonumber
    Stripe->>ConvStripe: inputsReq = convLoadBufferIn()<br/>memTransferWait(inputsReq)<br/>inputsReq = convLoadBufferIn()
    ConvStripe->>Unstripe: outputsReq = bufferToMemTransfer2D()
    Stripe->>ConvStripe: memTransferWait(inputsReq)<br/>inputsReq = convLoadBufferIn()
    ConvStripe->>Unstripe: memTransferWait(outputsReq)<br/>outputsReq = bufferToMemTransfer2D()
    Stripe->>ConvStripe: memTransferWait(inputsReq)<br/>inputsReq = convLoadBufferIn()
    ConvStripe->>Unstripe: memTransferWait(outputsReq)<br/>outputsReq = bufferToMemTransfer2D()
    Stripe->>ConvStripe: memTransferWait(inputsReq)<br/>inputsReq = convLoadBufferIn()
    ConvStripe->>Unstripe: memTransferWait(outputsReq)<br/>outputsReq = bufferToMemTransfer2D()
    Stripe->>ConvStripe: memTransferWait(inputsReq)
    ConvStripe->>Unstripe: memTransferWait(outputsReq)<br/>outputsReq = bufferToMemTransfer2D()<br/>memTransferWait(outputsReq)
gantt
    dateFormat s
    axisFormat %S
    title Scheduling
    Stripe_1 :crit, s1, 0, 2.05s
    Stripe_2 :s1b, after s1, 2s
    ConvStripe_1 :crit, c1, after s1, 3s
    Unstripe(1) :crit, u1, after c1, 0.05s
    Unstripe(1) :u1b, after u1, 1.5s
    Stripe_2 :crit, s2, after u1, 0.05s
    Stripe_3 :s2b, after s2, 2s
    ConvStripe_2 :crit, c2, after s2, 3s
    Unstripe(2) :crit, u2, after c2, 0.05s
    Unstripe(2) :u2b, after u2, 1.5s
    Stripe_3 :crit, s3, after u2, 0.05s
    Stripe_4 :s3b, after s3, 2s
    ConvStripe_3 :crit, c3, after s3, 3s
    Unstripe(3) :crit, u3, after c3, 0.05s
    Unstripe(3) :u3b, after u3, 1.5s
    Stripe_4 :crit, s4, after u3, 0.05s
    Stripe_5 :s4b, after s4, 2s
    ConvStripe_4 :crit, c4, after s4, 3s
    Unstripe(4) :crit, u4, after c4, 0.05s
    Unstripe(4) :u4b, after u4, 1.5s
    Stripe_5 :crit, s5, after u4, 0.05s
    ConvStripe_5 :crit, c5, after s5, 3s
    Unstripe(5) :crit, u5, after c5, 1.55s
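The call ordering in the sequence diagram corresponds to a double-buffered pipeline: the input transfer for the next stripe is started before the current stripe is computed, and each output transfer is only waited on just before the next one is issued. The sketch below reproduces that ordering as a trace. The names `convLoadBufferIn`, `memTransferWait` and `bufferToMemTransfer2D` come from the diagram; the function `stripe_schedule` and the `"compute"` step are assumptions added for illustration, and no real memory transfer is performed.

```python
def stripe_schedule(num_stripes):
    """Return the ordered (call, stripe) trace of a double-buffered stripe pipeline."""
    if num_stripes < 1:
        return []
    # Prologue: start loading the first stripe.
    trace = [("convLoadBufferIn", 1)]
    for i in range(1, num_stripes + 1):
        # Stripe i must be fully loaded before it is computed.
        trace.append(("memTransferWait(inputsReq)", i))
        if i < num_stripes:
            # Prefetch the next stripe so the load overlaps the computation.
            trace.append(("convLoadBufferIn", i + 1))
        # Hypothetical placeholder for the convolution on stripe i.
        trace.append(("compute", i))
        if i > 1:
            # The previous write-back must finish before issuing a new one.
            trace.append(("memTransferWait(outputsReq)", i - 1))
        # Start writing the result of stripe i back to memory.
        trace.append(("bufferToMemTransfer2D", i))
    # Epilogue: drain the last write-back.
    trace.append(("memTransferWait(outputsReq)", num_stripes))
    return trace


trace = stripe_schedule(5)
```

With five stripes, the trace contains five occurrences of each call, in the same order as the messages of the diagram once the `"compute"` entries are ignored.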
Multi-layer spatial tiling#
Goal: spatially tile multiple layers.
Proposed method:
Specify the required tile’s position and size at some point in the block;
Propagate the required spatial tile’s position and size backward through the graph (with a mechanism similar to the receptive-field computation in N2D2);
Create the tiling operators and duplicate the subgraph.
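The backward propagation step can be sketched along one spatial axis. This is a minimal illustration under stated assumptions: each layer is described only by a kernel size and a stride, with no padding and no dilation, and the helper names `input_region` and `propagate_backward` are hypothetical. For an output interval starting at `out_start` with `out_size` elements, the required input interval starts at `out_start * stride` and spans `(out_size - 1) * stride + kernel` elements.

```python
def input_region(out_start, out_size, kernel, stride=1):
    """Input interval (start, size) needed to compute one output interval."""
    in_start = out_start * stride
    in_size = (out_size - 1) * stride + kernel
    return in_start, in_size


def propagate_backward(out_start, out_size, layers):
    """Propagate a tile through `layers`, given as (kernel, stride) from first to last."""
    start, size = out_start, out_size
    # Walk from the last layer back to the first one.
    for kernel, stride in reversed(layers):
        start, size = input_region(start, size, kernel, stride)
    return start, size


# Two stacked 3-wide, stride-1 convolutions: an 8-element output tile at
# position 4 needs a 12-element input tile at position 4.
two_convs = propagate_backward(4, 8, [(3, 1), (3, 1)])

# A 3-wide stride-1 layer followed by a 2-wide stride-2 layer.
conv_then_stride2 = propagate_backward(4, 8, [(3, 1), (2, 2)])
```

Applying the same computation independently to each spatial axis gives the 2D tile position and size at the input of the block.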
When computing tile sizes, Pad operators must be handled specifically: only tiles located on an edge of the feature map should keep padding, and only on the sides that lie on that edge.
An offset may be required on the final relative tile’s position and size to account for the dimension reduction caused by the convolution.
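The padding rule for edge tiles can be illustrated along one axis. This is a sketch under assumptions, not the proposed implementation: the function name `padded_input_region` is hypothetical, padding is symmetric, and dilation is ignored. The required input interval is first computed in padded coordinates, then clipped to the real input; whatever falls outside becomes the padding that this particular tile keeps.

```python
def padded_input_region(out_start, out_size, kernel, stride, pad, in_len):
    """Return (in_start, in_size, pad_before, pad_after) for one tile on one axis."""
    # Interval in unpadded input coordinates; lo may be negative, hi may exceed in_len.
    lo = out_start * stride - pad
    hi = lo + (out_size - 1) * stride + kernel  # exclusive upper bound
    pad_before = max(0, -lo)       # non-zero only for a tile on the leading edge
    pad_after = max(0, hi - in_len)  # non-zero only for a tile on the trailing edge
    in_start = max(lo, 0)
    in_end = min(hi, in_len)
    return in_start, in_end - in_start, pad_before, pad_after


# 16-element input, 3-wide kernel, stride 1, padding 1: the output has 16
# elements, split here into four tiles of 4 output elements each.
tiles = [padded_input_region(start, 4, 3, 1, 1, 16) for start in (0, 4, 8, 12)]
```

In this example only the first and last tiles keep one element of padding, on their outer side; the two interior tiles read 6 real input elements and need no padding.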