Comments (12)
I want to generate Abstract Syntax Tree (AST) of a mathematical equation from my MathGrammar.g4 antlr file. I want a AST without grammar tags in the tree only I want the equation components in the tree. suppose, I have an equation x^2+y^2=z^2, so I want tree in this style
=
---(+)---(^)
-------------(x)
-------------(2)
---------(^)
-------------(y)
-------------(2)
---(^)
------(z)
------(2)
(=(+(^(x)(2))(^(y)(2)))(^(z)(2)))
How can I do it in inmemantlr system?
from inmemantlr.
Hi @sjiya, I have renamed this issue ;). Please always feel free to create new issues that match your problem description. The inmemantlr tree is pretty generic and contains all the information you are looking for regarding the grammar rules and the matched tokens. However, if you want to obtain the reduced parse tree you are asking for, I would execute the following steps:
- Generate the inmemantlr parse tree as described in the documentation
- Implement a parse tree processor that translates the inmemantlr parse tree to your reduced representation. You can find an example here. I provided a more detailed explanation about parse tree processors in the documentation. You might also want to checkout the parse tree processor test-cases.
from inmemantlr.
Thanks for prompt reply.
plz check attached code and generated tree. I am unable to print AST from this code. I would appreciate if you suggest changes in my code.
TreeTest.zip
from inmemantlr.
Thanks @sjiya, would it be also possible for you to share the grammar you have used in the example?
from inmemantlr.
I am attaching grammar file here.
LG.zip
from inmemantlr.
from inmemantlr.
Hi @sjiya, your code seems works on my machine ... this is the output:
Tree Node Size: 64
graph {
node [fontname=Helvetica,fontsize=11];
edge [fontname=Helvetica,fontsize=10];
n1 [label="root",shape=ellipse];
n2 [label="math",shape=ellipse];
n3 [label="relation",shape=ellipse];
n4 [label="relation",shape=ellipse];
n5 [label="expr",shape=ellipse];
n6 [label="additive",shape=ellipse];
n7 [label="mp",shape=ellipse];
n8 [label="unary",shape=ellipse];
n9 [label="postfix",shape=ellipse];
n10 [label="exp",shape=ellipse];
n11 [label="comp",shape=ellipse];
n12 [label="atom",shape=ellipse];
n11 -- n12;
n10 -- n11;
n9 -- n10;
n8 -- n9;
n7 -- n8;
n6 -- n7;
n5 -- n6;
n4 -- n5;
n3 -- n4;
n13 [label="relation",shape=ellipse];
n14 [label="expr",shape=ellipse];
n15 [label="additive",shape=ellipse];
n16 [label="mp",shape=ellipse];
n17 [label="unary",shape=ellipse];
n18 [label="postfix",shape=ellipse];
n19 [label="exp",shape=ellipse];
n20 [label="comp",shape=ellipse];
n21 [label="frac",shape=ellipse];
n22 [label="expr",shape=ellipse];
n23 [label="additive",shape=ellipse];
n24 [label="mp",shape=ellipse];
n25 [label="unary",shape=ellipse];
n26 [label="unary",shape=ellipse];
n27 [label="postfix",shape=ellipse];
n28 [label="exp",shape=ellipse];
n29 [label="comp",shape=ellipse];
n30 [label="atom",shape=ellipse];
n29 -- n30;
n28 -- n29;
n27 -- n28;
n26 -- n27;
n31 [label="postfix",shape=ellipse];
n32 [label="exp",shape=ellipse];
n33 [label="comp",shape=ellipse];
n34 [label="atom",shape=ellipse];
n33 -- n34;
n32 -- n33;
n31 -- n32;
n26 -- n31;
n35 [label="postfix",shape=ellipse];
n36 [label="exp",shape=ellipse];
n37 [label="comp",shape=ellipse];
n38 [label="func",shape=ellipse];
n39 [label="expr",shape=ellipse];
n40 [label="additive",shape=ellipse];
n41 [label="additive",shape=ellipse];
n42 [label="mp",shape=ellipse];
n43 [label="unary",shape=ellipse];
n44 [label="postfix",shape=ellipse];
n45 [label="exp",shape=ellipse];
n46 [label="exp",shape=ellipse];
n47 [label="comp",shape=ellipse];
n48 [label="atom",shape=ellipse];
n47 -- n48;
n46 -- n47;
n45 -- n46;
n49 [label="atom",shape=ellipse];
n45 -- n49;
n44 -- n45;
n43 -- n44;
n42 -- n43;
n41 -- n42;
n40 -- n41;
n50 [label="additive",shape=ellipse];
n51 [label="mp",shape=ellipse];
n52 [label="unary",shape=ellipse];
n53 [label="postfix",shape=ellipse];
n54 [label="exp",shape=ellipse];
n55 [label="comp",shape=ellipse];
n56 [label="atom",shape=ellipse];
n55 -- n56;
n54 -- n55;
n53 -- n54;
n52 -- n53;
n51 -- n52;
n50 -- n51;
n40 -- n50;
n39 -- n40;
n38 -- n39;
n37 -- n38;
n36 -- n37;
n35 -- n36;
n26 -- n35;
n25 -- n26;
n24 -- n25;
n23 -- n24;
n22 -- n23;
n21 -- n22;
n57 [label="expr",shape=ellipse];
n58 [label="additive",shape=ellipse];
n59 [label="mp",shape=ellipse];
n60 [label="unary",shape=ellipse];
n61 [label="postfix",shape=ellipse];
n62 [label="exp",shape=ellipse];
n63 [label="comp",shape=ellipse];
n64 [label="atom",shape=ellipse];
n63 -- n64;
n62 -- n63;
n61 -- n62;
n60 -- n61;
n59 -- n60;
n58 -- n59;
n57 -- n58;
n21 -- n57;
n20 -- n21;
n19 -- n20;
n18 -- n19;
n17 -- n18;
n16 -- n17;
n15 -- n16;
n14 -- n15;
n13 -- n14;
n3 -- n13;
n2 -- n3;
n1 -- n2;
}
However, I noticed that you are using this fragment in your code:
try {
sgrammar = new FileInputStream("/home/sharaf/antlr4/LatexParser/src/LG.g4");
sgrammarcontent = FileUtils.getStringFromStream(sgrammar);
}finally {}
I would suggest to catch the exception and (at least) print its in the catch block. Maybe it couldn't find the .g4.
file so that sgrammarcontent
is empty but since you don't catch the exception here, the program continues using an empty grammar.
from inmemantlr.
I am getting the same output like yours. but you can observe in your output that equation elements are not appears in the tree. For example, x, =, -, b, \pm, .. are not in the output.
Plz find the attached code that is generating AST.
AST.zip
from inmemantlr.
Hi @sjiya, oh I see. Yes, per default, Terminals are not included in the parse tree. You basically have two possiblities:
- Pass
true
to the constructor of the default tree listenerDefaultTreeListener t = new DefaultTreeListener(true);
so that terminals are getting included
or - Use the
getEidx
,getSidx
members of the parse tree nodes (leafs) in order to retrieve the terminal from the string:
When you go for option 1, you should get the following output `
graph {
node [fontname=Helvetica,fontsize=11];
edge [fontname=Helvetica,fontsize=10];
n1 [label="root",shape=ellipse];
n2 [label="math",shape=ellipse];
n3 [label="relation",shape=ellipse];
n4 [label="relation",shape=ellipse];
n5 [label="expr",shape=ellipse];
n6 [label="additive",shape=ellipse];
n7 [label="mp",shape=ellipse];
n8 [label="unary",shape=ellipse];
n9 [label="postfix",shape=ellipse];
n10 [label="exp",shape=ellipse];
n11 [label="comp",shape=ellipse];
n12 [label="atom",shape=ellipse];
n13 [label="x",shape=box];
n12 -- n13;
n11 -- n12;
n10 -- n11;
n9 -- n10;
n8 -- n9;
n7 -- n8;
n6 -- n7;
n5 -- n6;
n4 -- n5;
n3 -- n4;
n14 [label="=",shape=box];
n3 -- n14;
n15 [label="relation",shape=ellipse];
n16 [label="expr",shape=ellipse];
n17 [label="additive",shape=ellipse];
n18 [label="mp",shape=ellipse];
n19 [label="unary",shape=ellipse];
n20 [label="postfix",shape=ellipse];
n21 [label="exp",shape=ellipse];
n22 [label="comp",shape=ellipse];
n23 [label="frac",shape=ellipse];
n24 [label="\frac",shape=box];
n23 -- n24;
n25 [label="\{",shape=box];
n23 -- n25;
n26 [label="expr",shape=ellipse];
n27 [label="additive",shape=ellipse];
n28 [label="mp",shape=ellipse];
n29 [label="unary",shape=ellipse];
n30 [label="\-",shape=box];
n29 -- n30;
n31 [label="unary",shape=ellipse];
n32 [label="postfix",shape=ellipse];
n33 [label="exp",shape=ellipse];
n34 [label="comp",shape=ellipse];
n35 [label="atom",shape=ellipse];
n36 [label="b ",shape=box];
n35 -- n36;
n34 -- n35;
n33 -- n34;
n32 -- n33;
n31 -- n32;
n37 [label="postfix",shape=ellipse];
n38 [label="exp",shape=ellipse];
n39 [label="comp",shape=ellipse];
n40 [label="atom",shape=ellipse];
n41 [label="\pm",shape=box];
n40 -- n41;
n39 -- n40;
n38 -- n39;
n37 -- n38;
n31 -- n37;
n42 [label="postfix",shape=ellipse];
n43 [label="exp",shape=ellipse];
n44 [label="comp",shape=ellipse];
n45 [label="func",shape=ellipse];
n46 [label="\sqrt",shape=box];
n45 -- n46;
n47 [label="\{",shape=box];
n45 -- n47;
n48 [label="expr",shape=ellipse];
n49 [label="additive",shape=ellipse];
n50 [label="additive",shape=ellipse];
n51 [label="mp",shape=ellipse];
n52 [label="unary",shape=ellipse];
n53 [label="postfix",shape=ellipse];
n54 [label="exp",shape=ellipse];
n55 [label="exp",shape=ellipse];
n56 [label="comp",shape=ellipse];
n57 [label="atom",shape=ellipse];
n58 [label="b",shape=box];
n57 -- n58;
n56 -- n57;
n55 -- n56;
n54 -- n55;
n59 [label="\^",shape=box];
n54 -- n59;
n60 [label="atom",shape=ellipse];
n61 [label="2",shape=box];
n60 -- n61;
n54 -- n60;
n53 -- n54;
n52 -- n53;
n51 -- n52;
n50 -- n51;
n49 -- n50;
n62 [label="\-",shape=box];
n49 -- n62;
n63 [label="additive",shape=ellipse];
n64 [label="mp",shape=ellipse];
n65 [label="unary",shape=ellipse];
n66 [label="postfix",shape=ellipse];
n67 [label="exp",shape=ellipse];
n68 [label="comp",shape=ellipse];
n69 [label="atom",shape=ellipse];
n70 [label="4ac",shape=box];
n69 -- n70;
n68 -- n69;
n67 -- n68;
n66 -- n67;
n65 -- n66;
n64 -- n65;
n63 -- n64;
n49 -- n63;
n48 -- n49;
n45 -- n48;
n71 [label="\}",shape=box];
n45 -- n71;
n44 -- n45;
n43 -- n44;
n42 -- n43;
n31 -- n42;
n29 -- n31;
n28 -- n29;
n27 -- n28;
n26 -- n27;
n23 -- n26;
n72 [label="\}",shape=box];
n23 -- n72;
n73 [label="\{",shape=box];
n23 -- n73;
n74 [label="expr",shape=ellipse];
n75 [label="additive",shape=ellipse];
n76 [label="mp",shape=ellipse];
n77 [label="unary",shape=ellipse];
n78 [label="postfix",shape=ellipse];
n79 [label="exp",shape=ellipse];
n80 [label="comp",shape=ellipse];
n81 [label="atom",shape=ellipse];
n82 [label="2a",shape=box];
n81 -- n82;
n80 -- n81;
n79 -- n80;
n78 -- n79;
n77 -- n78;
n76 -- n77;
n75 -- n76;
n74 -- n75;
n23 -- n74;
n83 [label="\}",shape=box];
n23 -- n83;
n22 -- n23;
n21 -- n22;
n20 -- n21;
n19 -- n20;
n18 -- n19;
n17 -- n18;
n16 -- n17;
n15 -- n16;
n3 -- n15;
n2 -- n3;
n1 -- n2;
}
from inmemantlr.
Thanks julian, option 1 is working fine and its getting result with equation elements.
but I am bit confuse about option 2. how do I get equation element without chain of rules?? for example, I don't want following rules from n1-n12 in tree because the are use less in my case.
n1 [label="root",shape=ellipse];
n2 [label="math",shape=ellipse];
n3 [label="relation",shape=ellipse];
n4 [label="relation",shape=ellipse];
n5 [label="expr",shape=ellipse];
n6 [label="additive",shape=ellipse];
n7 [label="mp",shape=ellipse];
n8 [label="unary",shape=ellipse];
n9 [label="postfix",shape=ellipse];
n10 [label="exp",shape=ellipse];
n11 [label="comp",shape=ellipse];
n12 [label="atom",shape=ellipse];
n13 [label="x",shape=box];
from inmemantlr.
The second option is interesting if you just want to extract the terminals from the input string. As an example you could do this with:
String quad = "x=\\frac{-b \\pm \\sqrt{b^2-4ac}}{2a}";
// ...
DefaultTreeListener t = new DefaultTreeListener();
// ...
cast.getLeafs().stream().forEach(
l -> System.out.println("\"" + quad.substring(l.getSidx(), l.getEidx()+1) + "\"")
);
This works because the leafs are always directly linked to a matching token. In case to simplify your parse tree a bit by filtering certain rule, you can actually pass a filter to the constructor of the DefaultTreeListener
(
Parse Tree Pruning
in the documentation.from inmemantlr.
from inmemantlr.
Related Issues (20)
- Error when using with spring boot and antlr4 HOT 7
- Java 9 compatibility
- Compiler Options HOT 1
- Serialize grammar to project classpath HOT 1
- Add extra flag to DefaultTreeListener for terminal nodes HOT 1
- Generate and compile in memory only the lexer HOT 5
- Replace eclipse compiler with standard JavaCompiler API
- Better tree manipulation support
- JSON/XML serialization in topological order HOT 1
- Improve output for dot format
- Cleanup API
- Can't load any v4 grammars HOT 10
- I can not run all test case HOT 2
- How to get all the nodes after filtering HOT 12
- How to use "Utility Files" HOT 1
- Line and Column Error in ParseTreeNode HOT 5
- Can't parse with DE-serialized parser HOT 1
- class Version HOT 2
- InmemantlrTool uses only terminal-defined package names, ignoring action-defined ones
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from inmemantlr.