kuseman / payloadbuilder
SQL query engine
License: Apache License 2.0
Today, if a mapping contains a nested type, a simple where clause fails because the predicate is not wrapped in a nested filter.
{
  "nestedType": {
    "type": "nested",
    "properties": {
      "value": {
        "type": "integer"
      }
    }
  }
}
select *
from _doc
where nestedType.value = 10
This should work and should produce a (partial) body:
{
  "nested": {
    "path": "nestedType",
    "filter": {
      "term": {
        "value": 10
      }
    }
  }
}
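The wrapping described above could be sketched roughly as follows. This is only an illustration: `NestedFilterSketch` and `termFilter` are hypothetical names, not part of the ES catalog, and the sketch assumes the set of nested paths has already been read from the mapping. The field name inside the term filter mirrors the partial body shown above.

```java
import java.util.Set;

/** Hypothetical helper (not payloadbuilder code): wraps a term filter in a nested filter when needed. */
final class NestedFilterSketch
{
    /**
     * @param nestedPaths paths the mapping declares with "type": "nested" (assumed to be known up front)
     * @param qualifiedColumn the column as written in the query, e.g. nestedType.value
     * @param value the integer the column is compared to
     */
    static String termFilter(Set<String> nestedPaths, String qualifiedColumn, int value)
    {
        int dot = qualifiedColumn.lastIndexOf('.');
        String path = dot < 0 ? null : qualifiedColumn.substring(0, dot);
        if (path != null && nestedPaths.contains(path))
        {
            // The column lives under a nested path: wrap the term in a nested filter
            String field = qualifiedColumn.substring(dot + 1);
            String term = "{\"term\":{\"" + field + "\":" + value + "}}";
            return "{\"nested\":{\"path\":\"" + path + "\",\"filter\":" + term + "}}";
        }
        // Not nested: a plain term filter is enough
        return "{\"term\":{\"" + qualifiedColumn + "\":" + value + "}}";
    }
}
```

Calling `NestedFilterSketch.termFilter(Set.of("nestedType"), "nestedType.value", 10)` yields a compact version of the body above.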
select top 1 @var = t.col
from table t
When you don't have narrow filters you can often get pretty big and heavy requests that take a long time to fully load in the PLB. This could maybe be used to alleviate that.
If a query like this is executed:
Select * From purchase p Where s.amount > 10
i.e. one with a misspelling or similar, it keeps on running without returning any hits. The intention here was clearly p.amount > 10.
To avoid such mistakes introduce these rules:
This way, rule number 2 would kick in for the example above and we would get an error: Invalid table source reference 's'
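A rough sketch of such a check follows. The rules themselves are not reproduced here, so the sketch makes one loud assumption: the first part of a qualified column must be a known table alias. `AliasCheckSketch` and `validateQualifier` are hypothetical names, not the actual parser code.

```java
import java.util.Set;

/** Hypothetical validation step (an assumption about the rule, not the actual implementation). */
final class AliasCheckSketch
{
    /**
     * Fails fast when a qualified column references an alias that is not a table source in the query.
     * Assumption: the part before the first dot is always meant to be an alias.
     */
    static void validateQualifier(Set<String> tableAliases, String qualifiedColumn)
    {
        int dot = qualifiedColumn.indexOf('.');
        if (dot < 0)
        {
            // Unqualified column, nothing to validate in this sketch
            return;
        }
        String alias = qualifiedColumn.substring(0, dot);
        if (!tableAliases.contains(alias))
        {
            throw new IllegalArgumentException("Invalid table source reference '" + alias + "'");
        }
    }
}
```

With the aliases of the example query (`p` only), `validateQualifier(Set.of("p"), "s.amount")` throws, while `p.amount` passes.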
When selecting asterisk data from an alias and the columns change between rows, null
is returned for a column where the actual value is not null.
The problem is located in PayloadBuilderService, where a change of columns between rows is not correctly detected.
To avoid as much IO as possible, it would be preferable to cache as much as possible, even between different queries.
Add support for a new kind of table option(s)
with (cacheKey = <cache-expression>, cacheTTL = 10)
This only makes sense when the cached table source has an index; otherwise it will be hard to cache because a scan is the only alternative.
So for example (assuming an index on tableB):
select * from tableA a inner join tableB b with (cacheKey = listOf(a.id, @constant), cacheTTL=10) on b.id = a.id and b.active
This will put a caching operator in front of the index operator for tableB.
The cache operator will collect the keys that are missing from the cache, fetch those from downstream and put them into the cache.
The cache framework should be configurable from QuerySession.
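The lookup flow of such a cache operator might look roughly like the sketch below. It is not the actual operator: `CachingLookupSketch` is a hypothetical class, the cache is a plain `Map`, the downstream index operator is modelled as a function from missing keys to values, and TTL handling is left out entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of the cache operator's lookup: serve hits from cache, fetch only the misses. */
final class CachingLookupSketch<K, V>
{
    private final Map<K, V> cache;
    // Stand-in for the downstream index operator (an assumption): fetches values for a batch of keys
    private final Function<List<K>, Map<K, V>> downstream;

    CachingLookupSketch(Map<K, V> cache, Function<List<K>, Map<K, V>> downstream)
    {
        this.cache = cache;
        this.downstream = downstream;
    }

    Map<K, V> getAll(List<K> keys)
    {
        Map<K, V> result = new LinkedHashMap<>();
        List<K> missing = new ArrayList<>();
        for (K key : keys)
        {
            if (cache.containsKey(key))
            {
                result.put(key, cache.get(key));
            }
            else
            {
                missing.add(key);
            }
        }
        if (!missing.isEmpty())
        {
            // One downstream call for all missing keys
            Map<K, V> fetched = downstream.apply(missing);
            for (K key : missing)
            {
                V value = fetched.get(key);
                if (value != null)
                {
                    cache.put(key, value);
                }
                // Keys without a downstream value are still returned, mapped to null
                result.put(key, value);
            }
        }
        return result;
    }
}
```

Whether misses should also be cached (to avoid asking downstream again for keys that do not exist) is a design choice this sketch leaves open.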
Predicates are wrongly pushed down on left joins:
select *
from tableA a
left join tableB b
on b.col = a.col
where b.value <> ''
Here b.value <> ''
is pushed down to tableB, which is wrong.
In some plans we could rewrite the left join into an inner join if there is a predicate that checks for non-null values, but that will be for another time. For now I think the best approach is to never push anything down to a left-joined table source and let the user handle that.
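To illustrate why the push-down changes the result, here is a small stand-alone sketch (hypothetical code, not the query engine). It models tableA as a list of col values and tableB as a map from col to value, and compares filtering after the join with filtering tableB before the join.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of "where b.value <> ''" applied after vs. before a left join. */
final class LeftJoinPushDownSketch
{
    /** Correct: join first, then filter. A null b.value never satisfies b.value <> ''. */
    static List<String> filterAfterJoin(List<Integer> tableA, Map<Integer, String> tableB)
    {
        List<String> rows = new ArrayList<>();
        for (Integer col : tableA)
        {
            String value = tableB.get(col); // null when there is no matching b row
            if (value != null && !value.isEmpty())
            {
                rows.add(col + ":" + value);
            }
        }
        return rows;
    }

    /** Wrong: the predicate is pushed down to tableB, so filtered-out b rows come back as null-padded rows. */
    static List<String> filterPushedDown(List<Integer> tableA, Map<Integer, String> tableB)
    {
        Map<Integer, String> filteredB = new HashMap<>();
        for (Map.Entry<Integer, String> entry : tableB.entrySet())
        {
            if (!entry.getValue().isEmpty())
            {
                filteredB.put(entry.getKey(), entry.getValue());
            }
        }
        List<String> rows = new ArrayList<>();
        for (Integer col : tableA)
        {
            rows.add(col + ":" + filteredB.get(col));
        }
        return rows;
    }
}
```

With tableA = [1, 2] and tableB = {1: '', 2: 'x'}, filtering after the join returns only the row for 2, while the pushed-down variant also returns a row for 1 with a null b.value.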
Fetch the ES version to be able to build better abstractions regarding query building etc.
When having a setting like:
JsonSettings settings = new JsonSettings();
settings.setRowSeparator("\n");
settings.setResultSetsAsArrays(true);
return settings;
and only having one result set, no end-array is written.
This is because it is handled inside initResult but should be handled in endResult.
When projecting a lambda, i.e. p.map(x -> x.name),
the output will be the toString of the iterator.
Ease of life, either
Having a sub query expression like:
select
(
select x.col
from open_rows(a) x
for object
)
from table a
doesn't work today because the TVF is being resolved to the destination alias => a
and having an alias and trying to use that alias yields a syntax error.
This can be a bit difficult to solve because the framework today lets TVFs resolve the alias to
properly resolve qualifiers, so there needs to be some link between x
and the destination a
above when resolving.
The resulting map should contain all input keys.
for (TKey key : keys)
{
    CacheEntry<List<Tuple>> entry = cache.get(key);
    if (entry != null) // <---- remove
    {
        result.put(key, entry.value);
    }
}
When building the operator tree and a HashJoin is chosen, it is just a missing index that prevents a BatchHashJoin from being chosen, and this information could be printed to the session printer.
Today both catalog extensions and user preferences, like recent files etc., reside in the same config.json.
This is problematic when releasing new versions with a bundled config file for extensions, since that would overwrite the user preferences.
Fix!
Today only __id
is an index candidate, but every indexed/analyzed field in a mapping is a potential index candidate.
Those would need a query and not an mget to fetch.
Add support for this.
When having a group by today, count for example doesn't know that it's contained in a group by, and hence count(1) counts the scalar value 1, not the number of rows in the group for expression 1.
From the T-SQL documentation:
COUNT(*) returns the number of items in a group. This includes NULL values and duplicates.
COUNT(ALL expression) evaluates expression for each row in a group, and returns the number of nonnull values.
COUNT(DISTINCT expression) evaluates expression for each row in a group, and returns the number of unique, nonnull values.
So COUNT(1) should be treated as COUNT([expr]), i.e. count the scalar 1 for each row in the group.
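The three variants quoted above can be sketched over a single group as follows. This is a stand-alone illustration with hypothetical names (`CountSketch` and its methods), where a group is simply a list of rows and an expression is a function from row to value.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of the COUNT variants evaluated per group. */
final class CountSketch
{
    /** COUNT(*): number of rows in the group, including nulls and duplicates. */
    static <R> int countStar(List<R> group)
    {
        return group.size();
    }

    /** COUNT(ALL expression): evaluates the expression per row and counts non-null results. */
    static <R> int countExpression(List<R> group, Function<R, Object> expression)
    {
        int count = 0;
        for (R row : group)
        {
            if (expression.apply(row) != null)
            {
                count++;
            }
        }
        return count;
    }

    /** COUNT(DISTINCT expression): counts unique non-null results. */
    static <R> int countDistinct(List<R> group, Function<R, Object> expression)
    {
        Set<Object> seen = new HashSet<>();
        for (R row : group)
        {
            Object value = expression.apply(row);
            if (value != null)
            {
                seen.add(value);
            }
        }
        return seen.size();
    }
}
```

Under this model COUNT(1) is `countExpression(group, row -> 1)`: the constant is never null, so the result equals the number of rows in the group.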
This should work but doesn't
select obj.value
from table
group by obj.value
There is an issue in OperatorBuilderUtils#createGroupBy, which doesn't handle multi-part qualifiers correctly.
The resolving today is a bit messy because the framework allows functions to return multiple aliases,
e.g. unionall(alias1, alias2).
These kinds of constructs need to go since they make the code too complex.
Instead we need to implement proper support for UNION operators etc.
A query to ES that uses an index yields an mget request to ES with IDs.
However, if the index property is of wildcard type (customer-*), an invalid request is made
and ES responds:
{
  "docs": [
    {
      "_index": "customer-*",
      "_type": "type",
      "_id": "id-to-doc",
      "error": "[customer-*] missing"
    }
  ]
}
Detect wildcard index and switch mget to a regular query instead.
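The detection could be as small as the sketch below. The names (`IndexLookupSketch`, `LookupStrategy`, `chooseStrategy`) are hypothetical, and the sketch only looks for the `*` wildcard from the example above; other multi-index expressions are not considered.

```java
/** Hypothetical sketch: pick the lookup strategy based on the configured index name. */
final class IndexLookupSketch
{
    enum LookupStrategy
    {
        MGET,
        SEARCH
    }

    /** A wildcard index such as customer-* cannot be addressed by mget, so fall back to a regular query. */
    static LookupStrategy chooseStrategy(String index)
    {
        return index.indexOf('*') >= 0 ? LookupStrategy.SEARCH : LookupStrategy.MGET;
    }
}
```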
Add some logic in the query parser and try to provide better messages for common errors.
Also move the existing parser errors into a new class so they can be reused in Queryeer.
Today it's only supported to query full row information in subqueries:
select *
from
(
from table a
inner join table b
on b.col = a.col
) x
To fully comply with ANSI SQL we need:
select *
from
(
select a.value, b.value -- <------- THIS
from table a
inner join table b
on b.col = a.col
) x
When parsing, we can detect easy things such as the same expression on both sides of a comparison expression etc. These should be added as compiler warnings at some kind of info level to let clients handle them.
It would also be nice to have some kind of compiler option that could append a warning if a table scan is used, to be able to detect faulty queries as early as possible.
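As a rough idea of the first kind of warning, the sketch below flags comparisons whose two sides are the same expression. Everything in it is hypothetical: `CompilerWarningSketch` and `Comparison` are illustration-only types, and expressions are compared by their normalized text rather than by a real expression tree.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an info-level compiler warning pass. */
final class CompilerWarningSketch
{
    /** Minimal stand-in for a parsed comparison expression (an assumption, not the parser's model). */
    static final class Comparison
    {
        final String left;
        final String operator;
        final String right;

        Comparison(String left, String operator, String right)
        {
            this.left = left;
            this.operator = operator;
            this.right = right;
        }
    }

    /** Returns one warning per comparison that has the same expression on both sides. */
    static List<String> findWarnings(List<Comparison> comparisons)
    {
        List<String> warnings = new ArrayList<>();
        for (Comparison comparison : comparisons)
        {
            if (comparison.left.equals(comparison.right))
            {
                warnings.add("Same expression on both sides of comparison: "
                        + comparison.left + " " + comparison.operator + " " + comparison.right);
            }
        }
        return warnings;
    }
}
```

Returning the warnings as data, instead of failing the compilation, leaves it to each client to decide how to present them.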
When using the search function in ESCatalog with a query that outputs aggregations, these are not returned in the output.
Add an xml_value function, like json_value, to parse XML.
Also investigate XPath and see if that is an option.
It's very common to want to query something different from the same context (same catalog settings etc.) and hence to open a new tab.
But this loses all context today, so one has to set up the context from scratch.
Copy the current context to the new context. (copy constructor on QuerySession?)
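A copy constructor along those lines might look like the sketch below. It deliberately uses a hypothetical `SessionContextSketch` class instead of the real QuerySession, and it assumes the context consists of a default catalog alias plus per-catalog properties; the real session may hold more state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the session context carried by a query tab. */
final class SessionContextSketch
{
    private final Map<String, Map<String, Object>> catalogProperties = new HashMap<>();
    private String defaultCatalogAlias;

    SessionContextSketch()
    {
    }

    /** Copy constructor: the new tab starts with the same settings but does not share mutable state. */
    SessionContextSketch(SessionContextSketch source)
    {
        this.defaultCatalogAlias = source.defaultCatalogAlias;
        for (Map.Entry<String, Map<String, Object>> entry : source.catalogProperties.entrySet())
        {
            // Copy each inner map so changes in one tab do not leak into the other
            this.catalogProperties.put(entry.getKey(), new HashMap<>(entry.getValue()));
        }
    }

    void setDefaultCatalogAlias(String alias)
    {
        this.defaultCatalogAlias = alias;
    }

    String getDefaultCatalogAlias()
    {
        return defaultCatalogAlias;
    }

    void setCatalogProperty(String catalogAlias, String key, Object value)
    {
        catalogProperties.computeIfAbsent(catalogAlias, k -> new HashMap<>()).put(key, value);
    }

    Object getCatalogProperty(String catalogAlias, String key)
    {
        Map<String, Object> properties = catalogProperties.get(catalogAlias);
        return properties == null ? null : properties.get(key);
    }
}
```

The property values themselves are copied by reference here, which is fine for immutable values such as strings but would need more care for mutable ones.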