[]
        
(Showing Draft Content)

Formula Parser

GcExcel provides GrapeCity.Documents.Excel.Expressions library which allows you to parse formula expressions. The formula expressions are exposed at semantic model level so that you can create, visit and modify the formulas by using syntax tree. The FormulaSyntaxTree class represents a formula and is the entry point for formula expressions API.

Syntax Tree

The syntax tree represents semantic model of formulas. The Parse method of FormulaSyntaxTree class can be used to get syntax tree from text. However, the text should not start with "=" and should not be surrounded with "{= }". The Root property of FormulaSyntaxTree class can be used to get the root element of syntax tree. An empty syntax tree can be created by using FormulaSyntaxTree constructor.

Refer to the following example code to generate a formula with syntax tree.

//create a new workbook
var workbook = new GrapeCity.Documents.Excel.Workbook();

// Build syntax tree
var multiply = new OperatorNode(OperatorKind.Multiply);
var a1 = new Reference
{
    Row = 0,
    Column = 0
};
var a2 = new Reference
{
    Row = 1,
    Column = 0
};
multiply.Children.Add(new ReferenceNode(a1));
multiply.Children.Add(new ReferenceNode(a2));

var tree = new FormulaSyntaxTree { Root = multiply };

// Generates A1*A2
workbook.ActiveSheet.Range["A1"].Value = "'=" + tree.ToString();

//save to an excel file
workbook.Save("generateformula.xlsx");

Syntax Node

The SyntaxNode class represents a node in the syntax tree. The Children property can be used to get children of a non-terminal node. If the type of syntax node is a terminal node, then this collection is read-only. Similar to syntax tree, the Parse method of SyntaxNode class can be used get syntax node from text. An empty syntax node can be created by using SyntaxNode constructor.

Refer to the following example code to parse formula, modify the syntax tree by replacing the child of syntax node and convert it to a string.

//create a new workbook
var workbook = new GrapeCity.Documents.Excel.Workbook();

var originalFormula = @"LET(AppUpTime,NOW()-DATE(2020,4,17)+366, YEAR(AppUpTime)-1900-1 & "" years""";

// Replace NOW() with fixed date

// Get syntax tree
var syntaxTree = FormulaSyntaxTree.Parse(originalFormula);

// Find
var nowFunction = new FunctionNode("NOW");

// Replacement
var valentine2021 = new FunctionNode("DATE");
valentine2021.Children.Add(new NumberNode(2021));
valentine2021.Children.Add(new NumberNode(2));
valentine2021.Children.Add(new NumberNode(14));

// Find and replace
void replaceNode(SyntaxNode lookIn, SyntaxNode find, SyntaxNode replacement)
{
    var children = lookIn.Children;

    for (var i = 0; i < children.Count; i++)
    {
        var child = children[i];
        if (child.Equals(find))
        {
            children[i] = replacement;
        }
        else
        {
            replaceNode(child, find, replacement);
        }
    }
}

replaceNode(syntaxTree.Root, nowFunction, valentine2021);

// Output original and replaced
var sheet1 = workbook.ActiveSheet;
sheet1.Range["A1"].Value = "Original";
sheet1.Range["A2"].Value = "'=" + originalFormula.ToString();
sheet1.Range["A3"].Value = "Replaced";
sheet1.Range["A4"].Value = "'=" + syntaxTree.ToString();

// Arrange
sheet1.Range["A:A"].EntireColumn.AutoFit();

//save to an excel file
workbook.Save("modifyformula.xlsx");

Parse and Unparse Options

The ParseContext and UnparseContext classes contain options for converting strings to FormulaSyntaxTree and vice versa respectively. The BaseRow and BaseColumn properties can be used to specify the location of formula and IsR1C1 property can be used to specify the reference style.

Refer to the following example code to specify base row, base column and R1C1 reference style in options.

//create a new workbook
var workbook = new GrapeCity.Documents.Excel.Workbook();

// Convert R1C1 to A1
var r1c1Formula = "R1C:R8C[4]*9";
// At H2
var formulaRow = 1;
var formulaColumn = 7;

// Parse
var r1c1Option = new ParseContext { IsR1C1 = true };
var syntaxTree = FormulaSyntaxTree.Parse(r1c1Formula, r1c1Option);

// ToString
// Specify BaseRow and BaseColumn in a1Option.
// Because row and column are absolute index in A1 format.
var a1Option = new UnParseContext
{
    BaseColumn = formulaColumn,
    BaseRow = formulaRow
};
var converted = syntaxTree.ToString(a1Option);

// Output
var sheet1 = workbook.ActiveSheet;
sheet1.Range["A1"].Value = "Original formula (at H2)";
sheet1.Range["A2"].Value = "'=" + r1c1Formula.ToString();
sheet1.Range["A3"].Value = "Converted";
sheet1.Range["A4"].Value = "'=" + converted.ToString();

// Arrange
sheet1.Range["A:A"].EntireColumn.AutoFit();

//save to an excel file
workbook.Save("parseandformatoptions.xlsx");

Refer to the following example code to parse formula and then print the syntax tree.

//create a new workbook
var workbook = new GrapeCity.Documents.Excel.Workbook();

const string Formula = "RAND()>0.5+0.001";

// Get syntax tree
var syntaxTree = FormulaSyntaxTree.Parse(Formula);

// Flatten nodes
var displayItems = new List<(string TypeName, int IndentLevel, string Content)>();

void flatten(SyntaxNode node, int level)
{
    displayItems.Add((node.GetType().Name, level, node.ToString()));
    foreach (var child in node.Children)
    {
        flatten(child, level + 1);
    }
}

flatten(syntaxTree.Root, 0);

// Output
var sheet1 = workbook.Worksheets["Sheet1"];
sheet1.ShowRowOutline = false;
sheet1.OutlineColumn.ColumnIndex = 1;
sheet1.OutlineColumn.CollapseIndicator = new ImageSource(GetResourceStream("decreaseIndicator.png"), ImageType.PNG);
sheet1.OutlineColumn.ExpandIndicator = new ImageSource(GetResourceStream("increaseIndicator.png"), ImageType.PNG);

// Header
sheet1.Range["A1"].Value = "Formula";
sheet1.Range["B1"].Value = "Syntax node";
sheet1.Range["C1"].Value = "Part";

// Values
sheet1.Range["A2"].Value = "'=" + Formula;
for (var i = 0; i < displayItems.Count; i++)
{
    var item = displayItems[i];
    var text = "'" + item.TypeName;

    sheet1.Range[i + 1, 1].Value = text;
    sheet1.Range[i + 1, 1].IndentLevel = item.IndentLevel;
    sheet1.Range[i + 1, 2].Value = "'" + item.Content;
}

// Arrange
sheet1.Range["A:C"].EntireColumn.AutoFit();
sheet1.Range["B:B"].EntireColumn.ColumnWidthInPixel += 40;

//save to an excel file
workbook.Save("printformulasyntax.xlsx");

Other Classes in GrapeCity.Documents.Excel.Expressions Library

The ReferenceNode class represents a reference expression in the syntax tree.

The Reference class represents a range reference in formula. The reference can be across a cell, range, cross-worksheet, cross-worksheet 3D or cross-workbook.

Note: If a row or column index is relative, BaseRow or BaseColumn properties should be used to convert to absolute index.

The WorkbookReference class is an immutable class which represents a reference to an external workbook by name or local file path. If the workbook reference is from file path, the BaseDirectory property contains the directory information.

Note: The path separator is platform specific and affects the result of workbook reference. For example, 'C:\Temp\[Book1.xlsx]Sheet1'!A2 is a valid reference on Windows but invalid on Linux.

For example, the parsed object for  a workbook referenced by name: [Book1]Sheet1!A2 will look like below:

new ReferenceNode(
    new Reference {
        Workbook=WorkbookReference.FromName("Book1"),
        WorksheetName="Sheet1",
        Row=1,
        Column=0
    }
);

The parsed object for a workbook referenced by file path: 'C:\Temp\[Book1.xlsx]Sheet1'!A2 will look like below:

new ReferenceNode(
    new Reference {
        Workbook=WorkbookReference.FromFilePath(@"C:\Temp\Book1.xlsx"),
        WorksheetName="Sheet1",
        Row=1,
        Column=0
    }
)

The parsed object for a workbook referenced from a web URI will look like below:

var worbookRef = WorkbookReference.FromUri( "https://somesite.com/files/sample.xlsx");
var model = new ReferenceNode(
    new Reference
    {
        Workbook = worbookRef,
        WorksheetName = "sheet1",
        Row = 8,
        Column = 1,
    }
);

The FunctionNode class represents a function invocation expression in the syntax tree.

For example, the parsed object for Excel formula: COUNTIF(A:A,"*?") will look like below:

new FunctionNode("COUNTIF") {
    Children = {
        new ReferenceNode(
            new Reference {
                HasRow=false, LastColumn=0
            }
        ),
        new TextNode("*?")
    }
};

The NameNode class represents the name in a syntax tree.

For example, the parsed object for a workbook referenced by name: '[BuildingSales]JanIn2021'!RawData will look like below:

new NameNode("RawData", WorkbookReference.FromName("BuildingSales"), "JanIn2021", null);

The parsed object for a workbook referenced by file path: 'E:\[BuildingSales.xlsx]JanIn2021'!RawData will look like below:

new NameNode("RawData", WorkbookReference.FromFilePath(@"E:\BuildingSales.xlsx"), "JanIn2021", null);

The ErrorNode class represents an error literal node in the syntax tree. The following error types are not supported:

  • #BLOCKED!

  • #CONNECT!

  • #FIELD!

  • #UNKNOWN!

  • #REF! error is parsed to ReferenceNode

The ArrayNode class represents an array literal in the syntax tree. There are following array constraints:

  • The length of array must be > 0

  • Elements can be Double, String, Boolean or CalcError. Primitive number types are converted to double implicitly.

  • The lower bound of each ranks must be 0

  • The array and Elements can't be null

To know more about other classes, please refer GrapeCity.Documents.Excel.Expressions API documentation.

Limitations

  • GetHashCode method of FormulaSyntaxTree and SyntaxNode class are not supported. They return constant values because all fields are mutable.

  • GcExcel does not support resolving workbook index defined in OpenXML or JSON file storage. They are treated as workbook reference by name.