
cuse's People

Contributors

dependabot[bot], georgievjon, ivanlazov, krasidimitrov, mgenov, mlesikov, vasilmitovclouway


cuse's Issues

Should be possible to register a list of indexes

Currently we have to write it this way:

for (Address address : addresses) {
  searchEngine.register(new AddressIndex(address));
}

It would be a lot easier if we could just call:

  searchEngine.registerAll(addressIndexes);
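
A minimal sketch of how such a method could be layered on top of the existing register call. The IndexRegistry interface and the RegisterAllExample class are hypothetical stand-ins and not part of cuse; only the register(...) call shape comes from the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the search engine; only register(...) mirrors the issue text.
interface IndexRegistry {
  void register(Object index);

  // Proposed convenience method: registers every index in iteration order.
  default void registerAll(Iterable<?> indexes) {
    for (Object index : indexes) {
      register(index);
    }
  }
}

class RegisterAllExample {
  // Registers the given indexes through registerAll and returns what was registered.
  static List<Object> registerAndCollect(List<?> indexes) {
    List<Object> registered = new ArrayList<>();
    IndexRegistry engine = registered::add; // the "engine" just records each index
    engine.registerAll(indexes);
    return registered;
  }

  public static void main(String[] args) {
    System.out.println(registerAndCollect(Arrays.asList("index-1", "index-2")));
  }
}
```

A default method keeps existing implementations source-compatible, which is one reason to prefer it over a new abstract method.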

Allow searching by Date

Provide a way to execute search queries by Date.

It should be possible to search by Date using the following comparison operators: >, >=, <, <=, =.
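
As an illustration only, the five operators could be modelled as an enum and rendered into a filter string. DateFilterExample, Operator and dateFilter are hypothetical names, and the "field operator yyyy-MM-dd" output format is an assumption about what the search backend accepts, not something confirmed by cuse.

```java
import java.time.LocalDate;

class DateFilterExample {
  // The five comparison operators requested in the issue.
  enum Operator {
    GREATER_THAN(">"),
    GREATER_OR_EQUAL(">="),
    LESS_THAN("<"),
    LESS_OR_EQUAL("<="),
    EQUAL("=");

    final String symbol;

    Operator(String symbol) {
      this.symbol = symbol;
    }
  }

  // Renders a filter such as "birthDate >= 2014-01-03".
  // LocalDate.toString() produces the ISO yyyy-MM-dd form.
  static String dateFilter(String field, Operator operator, LocalDate date) {
    return field + " " + operator.symbol + " " + date;
  }

  public static void main(String[] args) {
    System.out.println(dateFilter("birthDate", Operator.GREATER_OR_EQUAL, LocalDate.of(2014, 1, 3)));
  }
}
```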

composite or queries

cuse could also provide composite queries that combine conditions with "or" and "and".

// Query: "(deletedOn >= :nextDay or deleted:true)"
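
One possible shape for this, sketched with plain string composition. CompositeQueryExample and its or/and helpers are hypothetical; the only thing taken from the issue is the target query string shown above.

```java
class CompositeQueryExample {
  // Joins the conditions with the given connective and wraps them in parentheses.
  static String combine(String connective, String... conditions) {
    return "(" + String.join(" " + connective + " ", conditions) + ")";
  }

  static String or(String... conditions) {
    return combine("or", conditions);
  }

  static String and(String... conditions) {
    return combine("and", conditions);
  }

  public static void main(String[] args) {
    System.out.println(or("deletedOn >= :nextDay", "deleted:true"));
  }
}
```

Because each helper returns a parenthesised string, the results can be nested, for example and("closed:false", or(...)).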

Word Breakdown Comparison

While developing search for a client device by partial serial number, we came up with TextIndex, which provides a set of indexes: all combinations of subwords (substrings) for a given set of words.
This is a simple comparison between the two approaches (TextIndex and IndexWriter).
Use it if you find it useful.

Test word: mississippi

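import static com.google.common.collect.Lists.newArrayList;
import static com.google.common.collect.Sets.newLinkedHashSet;

import java.util.List;
import java.util.Set;

// Generates every distinct substring of the given words, upper-cased, as index entries.
public class TextIndex {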
  private final Set<String> words;

  public TextIndex(String... words) {
    this.words = newLinkedHashSet(newArrayList(words));
  }

  public List<String> generate() {
    Set<String> index = newLinkedHashSet();

    for (String word : words) {
      word = word.toUpperCase();

      for (int i = 0; i < word.length(); i++) {
        for (int j = i + 1; j < word.length(); j++) {
          index.add(word.substring(i, j));
        }

        index.add(word.substring(i));
      }
    }

    return newArrayList(index);
  }
}

Generated Indexes: 53

M
MI
MIS
MISS
MISSI
MISSIS
MISSISS
MISSISSI
MISSISSIP
MISSISSIPP
MISSISSIPPI
I
IS
ISS
ISSI
ISSIS
ISSISS
ISSISSI
ISSISSIP
ISSISSIPP
ISSISSIPPI
S
SS
SSI
SSIS
SSISS
SSISSI
SSISSIP
SSISSIPP
SSISSIPPI
SI
SIS
SISS
SISSI
SISSIP
SISSIPP
SISSIPPI
ISSIP
ISSIPP
ISSIPPI
SSIP
SSIPP
SSIPPI
SIP
SIPP
SIPPI
IP
IPP
IPPI
P
PP
PPI
PI
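
The count above can be reproduced without Guava. The following is a JDK-only sketch of the same substring enumeration; SubstringCountExample is a hypothetical name, and the loops are merged into one inner loop (j runs up to and including the word length) rather than copied from TextIndex.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SubstringCountExample {
  // Collects every distinct substring of the word, upper-cased, in first-seen order.
  static Set<String> substrings(String word) {
    Set<String> index = new LinkedHashSet<>();
    String upper = word.toUpperCase();
    for (int i = 0; i < upper.length(); i++) {
      for (int j = i + 1; j <= upper.length(); j++) {
        index.add(upper.substring(i, j));
      }
    }
    return index;
  }

  public static void main(String[] args) {
    // For "mississippi" this matches the 53 entries listed above.
    System.out.println(substrings("mississippi").size());
  }
}
```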

IndexWriter

Generated Indexes: 25

ssissippi
issippi
sissippi
mis
ssissip
iss
sippi
mississ
m
ippi
ississipp
mississipp
mississippi
i
ppi
missi
mississip
missis
mississi
ssippi
sissi
ississippi
pi
mi
miss

replace Indexing Strategy object with annotations

Example strategy:

public class SearchIndexStrategy implements IndexingStrategy<SearchIndex> {

  @Override
  public String getIndexName() {
    return SearchIndex.class.getSimpleName();
  }

  @Override
  public String getId(SearchIndex index) {
    return String.valueOf(index.getId());
  }

  @Override
  public IndexingSchema getIndexingSchema() {
    return IndexingSchema.aNewIndexingSchema()
            .fields("id", "state", "creationDate", "tags", "operationZones")
            .fullTextFields("city", "street", "customer")
            .build();
  }
}

can be changed like this:

@SearchIndex("SearchIndex")
public class SearchIndex {
  @SearchIndexId
  private Long id;

  @FullWordSearch
  private String city;

  @FullTextSearch
  private String street;

  @FullWordSearch
  private String customer;

  private String state;

  @SearchableDate
  private Date creationDate;

  private List<String> tags = Lists.newArrayList();
  private List<Long> operationZones = Lists.newArrayList();


  public SearchIndex(Object object) {
    id = object.getId();
    operationZones = Lists.newArrayList(object.getOperationZoneIds());

    if (object.getState() != null) {
      state = object.getState().getState();
    }

    if (object.getCreationInfo() != null) {
      creationDate = object.getCreationInfo().getCreationDate();
    }

    if (object.getTags() != null) {
      tags = Lists.newArrayList(object.getTags());
    }

    // the post code should be a separate property and not part of the full-text search, for better search results
    Address serviceAddress = object.getServiceAddress();
    if (serviceAddress != null) {
      city = serviceAddress.getCityLine();
      street = serviceAddress.getAddressLine();
    }

    if (object.getClient() != null) {
      String customerLine = ObjectLineBuilder.line()
              .wordSuf(object.getClient().getNameLine(), " ")
              .wordSuf(object.getClient().getEmail(), " ")
              .wordSuf(object.getClient().getTelephone(), " ")
              .build();

      customer = customerLine;
    }

  }

  // not sure we will need this once we have the @SearchIndexId annotation
  public Long getId() {
    return id;
  }
}

Suggestion: proper exceptions should be thrown, so that when something important is missing it is easy to find and fix.

See also the related issue "register type converters" (#13).

Sort by date (asc/desc order) and passed offset

In the past we had problems in the local environment when executing search queries that sort the returned results by date (asc/desc order) with an offset passed.

We ran some tests once again to verify this behavior, and they returned correct data. For now we will accept that the API is working fine.

Here are some of the tests:

  @Test
  public void sortConsecutiveRecordsByDateAndPassedOffsetAsc() {

    store(aNewEmployee().id(1l).birthDate(aNewDate(2014, 1, 3)).build());
    store(aNewEmployee().id(2l).birthDate(aNewDate(2014, 1, 5)).build());
    store(aNewEmployee().id(3l).birthDate(aNewDate(2014, 1, 8)).build());
    store(aNewEmployee().id(4l).birthDate(aNewDate(2014, 1, 13)).build());
    store(aNewEmployee().id(5l).birthDate(aNewDate(2014, 1, 21)).build());

    List<Employee> result = searchEngine.search(Employee.class)
            .sortBy("birthDate", SortOrder.ASCENDING, SortType.NUMERIC)
            .offset(2)
            .returnAll()
            .now();

    assertThat(result.size(), is(3));
    assertThat(result.get(0).id, is(3l));
    assertThat(result.get(1).id, is(4l));
    assertThat(result.get(2).id, is(5l));
  }

  @Test
  public void sortNonConsecutiveRecordsByDateAndPassedOffsetAsc() {

    store(aNewEmployee().id(5l).birthDate(aNewDate(2014, 1, 21)).build());
    store(aNewEmployee().id(2l).birthDate(aNewDate(2014, 1, 5)).build());
    store(aNewEmployee().id(4l).birthDate(aNewDate(2014, 1, 13)).build());
    store(aNewEmployee().id(1l).birthDate(aNewDate(2014, 1, 3)).build());
    store(aNewEmployee().id(3l).birthDate(aNewDate(2014, 1, 8)).build());

    List<Employee> result = searchEngine.search(Employee.class)
            .sortBy("birthDate", SortOrder.ASCENDING, SortType.NUMERIC)
            .offset(4)
            .returnAll()
            .now();

    assertThat(result.size(), is(1));
    assertThat(result.get(0).id, is(5l));
  }

  @Test
  public void sortConsecutiveRecordsByDateAndPassedOffsetDesc() {

    store(aNewEmployee().id(1l).birthDate(aNewDate(2014, 1, 5)).build());
    store(aNewEmployee().id(2l).birthDate(aNewDate(2014, 1, 10)).build());
    store(aNewEmployee().id(3l).birthDate(aNewDate(2014, 1, 15)).build());
    store(aNewEmployee().id(4l).birthDate(aNewDate(2014, 1, 4)).build());

    List<Employee> result = searchEngine.search(Employee.class)
            .sortBy("birthDate", SortOrder.DESCENDING, SortType.NUMERIC)
            .offset(1)
            .fetchMaximum(10)
            .now();

    assertThat(result.size(), is(3));
    assertThat(result.get(0).id, is(2l));
    assertThat(result.get(1).id, is(1l));
    assertThat(result.get(2).id, is(4l));
  }

  @Test
  public void sortNonConsecutiveRecordsByDateAndPassedOffsetDesc() {

    store(aNewEmployee().id(2l).birthDate(aNewDate(2014, 1, 10)).build());
    store(aNewEmployee().id(4l).birthDate(aNewDate(2014, 1, 4)).build());
    store(aNewEmployee().id(1l).birthDate(aNewDate(2014, 1, 5)).build());
    store(aNewEmployee().id(3l).birthDate(aNewDate(2014, 1, 15)).build());

    List<Employee> result = searchEngine.search(Employee.class)
            .sortBy("birthDate", SortOrder.DESCENDING, SortType.NUMERIC)
            .offset(2)
            .fetchMaximum(10)
            .now();

    assertThat(result.size(), is(2));
    assertThat(result.get(0).id, is(1l));
    assertThat(result.get(1).id, is(4l));
  }
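
The behaviour these tests expect (sort by date first, then skip the offset, then cap the result size) can be modelled in plain Java. This is only a reference sketch of the expected semantics, not how cuse or the Search API implements it; SortOffsetExample and idsSortedByDate are hypothetical names.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SortOffsetExample {
  // Sorts the ids by their date, skips `offset` entries and returns at most `limit` ids.
  static List<Long> idsSortedByDate(Map<Long, LocalDate> birthDates,
                                    boolean ascending, int offset, int limit) {
    Comparator<Map.Entry<Long, LocalDate>> byDate =
        (a, b) -> a.getValue().compareTo(b.getValue());
    if (!ascending) {
      byDate = byDate.reversed();
    }
    return birthDates.entrySet().stream()
        .sorted(byDate)
        .skip(offset)
        .limit(limit)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Same data as sortNonConsecutiveRecordsByDateAndPassedOffsetDesc above.
    Map<Long, LocalDate> birthDates = new LinkedHashMap<>();
    birthDates.put(2L, LocalDate.of(2014, 1, 10));
    birthDates.put(4L, LocalDate.of(2014, 1, 4));
    birthDates.put(1L, LocalDate.of(2014, 1, 5));
    birthDates.put(3L, LocalDate.of(2014, 1, 15));
    System.out.println(idsSortedByDate(birthDates, false, 2, 10));
  }
}
```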

Search by fields with values containing underscores

The GAE Search API documentation states that underscores do not break up words.

We wrote some tests to verify that behaviour, and some of them did not return the expected results.

Here is an example test:

  @Test
  public void searchForFieldThatContainsUnderscore() {

    store(aNewEmployee().id(1l).firstName("John Adam").build());
    store(aNewEmployee().id(2l).firstName("John_Adam").build());

    List<Employee> result = searchEngine.search(Employee.class).where("firstName", SearchFilters.is("John")).returnAll().now();

    assertThat(result.size(), is(1));
  }

This test fails in the local environment: instead of returning only one result, the search returns both. We deployed the same code to the test application, and there we received only one matching result, as expected. It turns out that the local implementation of the Search API behaves differently from production and breaks up words at underscores.

For this reason, when we need to index fields in the future, it is better to avoid values containing underscores.

Exception is thrown when the search string includes "/"

Caused by: com.google.appengine.api.search.SearchQueryException: Unable to parse query: куче/088888888 closed:false locationId:(7057283 OR 7133203 OR 7173275 OR 7229040 OR 7237249 OR 7237264 OR 7241130 OR 7245148 OR 7247156 OR 7251005 OR 7251008 OR 7251009 OR 7251010 OR 7259004 OR 7268195 OR 7271007 OR 7271008 OR 7275002 OR 7277005 OR 7286002 OR 7286003 OR 7289002 OR 7290002 OR 7292002 OR 8804086 OR 6384259640066048 OR 5229884100050944) departmentId:(7295047 OR 7296057 OR 7297060 OR 7299058 OR 7301067 OR 7302062 OR 7302063 OR 7304046 OR 7305071 OR 7306056 OR 7308031)
at com.google.appengine.api.search.checkers.QueryChecker.checkQueryParses(QueryChecker.java:44)
at com.google.appengine.api.search.checkers.QueryChecker.checkQuery(QueryChecker.java:28)
at com.google.appengine.api.search.Query$Builder.setQueryString(Query.java:91)
at com.google.appengine.api.search.Query$Builder.build(Query.java:107)
at com.clouway.cuse.gae.GaeSearchApiMatchedIdObjectFinder.buildQuery(GaeSearchApiMatchedIdObjectFinder.java:53)
at com.clouway.cuse.gae.GaeSearchApiMatchedIdObjectFinder.find(GaeSearchApiMatchedIdObjectFinder.java:26)
at com.clouway.cuse.spi.Search.now(Search.java:128)

index whole words

When we store a field whose value contains many words, we should store each word in the index. This way we can later search by any of the stored words.
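
A small sketch of the word split this implies, assuming words are separated by whitespace only. WholeWordsExample and words are hypothetical names; punctuation handling and case folding are deliberately left out.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class WholeWordsExample {
  // Splits a field value on whitespace and returns the distinct words in order.
  static Set<String> words(String value) {
    Set<String> words = new LinkedHashSet<>();
    for (String word : value.trim().split("\\s+")) {
      if (!word.isEmpty()) {
        words.add(word);
      }
    }
    return words;
  }

  public static void main(String[] args) {
    System.out.println(words("John  Adam Smith"));
  }
}
```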

indexing: should be possible to have dynamic fields in index

Given a map of field names and values, generated for example from a user-filled nomenclature, it should be possible to search by a specific field.
Example case:

@SearchIndex(name = "DeviceIndex")
public class DeviceIndex {
  @SearchId
  private Long id;
  private String type;
  ...
  @DynamicFields  // example new annotation for this purpose
  private Map<String, String> fields; //{"fieldName1": "value1", "fieldName2": "value2", ...}
  ...
}

That way the values of the dynamic fields would be broken down in the search index for FullTextSearch, but the keys (the names of the dynamic fields) would not be broken down.

Potential problems with where to store the field names in the index:

  • If they are stored as properties of the index, the names are limited: they can contain only letters, digits and underscores, like a variable name.
  • If they are stored in the values of fields, some delimiter is needed to separate the key from the value, and it is unclear how the key would be specified when searching.
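
For the first option, the naming restriction could be checked and worked around along these lines. This is a sketch under assumptions: the pattern encodes only the rule stated above plus the assumption that a name must start with a letter, and DynamicFieldNameExample, isValidFieldName, toFieldName and the "f_" prefix are all hypothetical.

```java
import java.util.regex.Pattern;

class DynamicFieldNameExample {
  // Assumed rule: starts with a letter, then only letters, digits and underscores.
  private static final Pattern VALID = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

  static boolean isValidFieldName(String name) {
    return VALID.matcher(name).matches();
  }

  // Replaces every disallowed character with '_' and prefixes names
  // that would not start with a letter.
  static String toFieldName(String key) {
    String cleaned = key.replaceAll("[^A-Za-z0-9_]", "_");
    boolean startsWithLetter = !cleaned.isEmpty() && Character.isLetter(cleaned.charAt(0));
    return startsWithLetter ? cleaned : "f_" + cleaned;
  }

  public static void main(String[] args) {
    System.out.println(toFieldName("serial number"));
  }
}
```

Note that this mapping is lossy: "serial number" and "serial_number" end up with the same field name, so a real implementation would need a collision strategy.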

Search for matching field containing many values by passing list of values

We should be able to execute search queries on fields containing many values by passing a list of values.

For example, if we have the following indexed field

tags: 1, 2, answered

then executing the following query should return results:

searchEngine.search(Index.class).where("tags", SearchFilters.is(Arrays.asList("1", "answered"))).returnAll().now();
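
One way such a filter might be rendered, borrowing the field:(a OR b) shape that appears in the query string of the stack trace elsewhere in these issues. Whether the list should match any value (as sketched here) or all values is not stated in the issue, so the OR semantics is an assumption; MultiValueFilterExample and anyOf are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class MultiValueFilterExample {
  // Renders e.g. tags:(1 OR answered); matches documents containing any of the values.
  static String anyOf(String field, List<String> values) {
    return field + ":(" + String.join(" OR ", values) + ")";
  }

  public static void main(String[] args) {
    System.out.println(anyOf("tags", Arrays.asList("1", "answered")));
  }
}
```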

search/cache: backing cache support

We could improve the search API by adding a backing cache for query results.

The eviction policy needs careful consideration, because we have to ensure that results served from the cache stay consistent with the results of API calls.
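
A minimal in-memory sketch of one possible policy: cache results per index and query, and drop everything cached for an index whenever that index is written to. QueryResultCache and its methods are hypothetical; a real backing cache (memcache, for instance) would also need size limits and expiry.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class QueryResultCache {
  // index name -> (query string -> cached ids)
  private final Map<String, Map<String, List<Long>>> cache = new HashMap<>();

  // Returns the cached ids for the query, running the real search only on a miss.
  List<Long> get(String indexName, String query, Supplier<List<Long>> search) {
    return cache
        .computeIfAbsent(indexName, name -> new HashMap<>())
        .computeIfAbsent(query, q -> search.get());
  }

  // Call on every register/delete for the index so stale results are never served.
  void invalidate(String indexName) {
    cache.remove(indexName);
  }

  public static void main(String[] args) {
    QueryResultCache cache = new QueryResultCache();
    System.out.println(cache.get("AddressIndex", "city:Tarnovo", () -> Arrays.asList(1L, 2L)));
  }
}
```

Invalidating a whole index on every write is coarse but easy to reason about; finer-grained eviction would require knowing which queries a changed document can affect.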

Interpret search query strings as one word when contains word with special characters

When the search query is "12:34:c4", the string is interpreted as three separate words and many matches are found.
A query word that contains special characters should be interpreted as a single word, so after escaping the query should look like ""12:34:c4"".

Example:
search query : "Tarnovo 12:34:c4"
should be escaped like this: "Tarnovo "12:34:c4""
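
A sketch of that escaping step, assuming a "special" word is any whitespace-separated token containing a character that is not a letter or digit. QueryEscapingExample and quoteSpecialTokens are hypothetical names, and tokens that already contain quotes are not handled.

```java
class QueryEscapingExample {
  // Wraps every token that contains a non-alphanumeric character in double quotes.
  static String quoteSpecialTokens(String query) {
    StringBuilder escaped = new StringBuilder();
    for (String token : query.trim().split("\\s+")) {
      if (escaped.length() > 0) {
        escaped.append(' ');
      }
      boolean plain = token.chars().allMatch(Character::isLetterOrDigit);
      escaped.append(plain ? token : "\"" + token + "\"");
    }
    return escaped.toString();
  }

  public static void main(String[] args) {
    System.out.println(quoteSpecialTokens("Tarnovo 12:34:c4"));
  }
}
```

The same rule would also quote a token containing "/", but whether that avoids the parse exception reported in the "/" issue has not been verified here.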

Unclear parameter "indexName" in the searchEngine.delete(indexName, ids) method

It is unclear what the index name should be.

searchEngine.get().delete("indexName", addressIds);

The index name is defined in the IndexingStrategy object, which has this method:

...
String getIndexName();
...

It would be better to pass the index class (an interface may be needed), so that the engine can find the strategy and the index name, or to pass the indexing strategy directly.
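
The first option could look roughly like this: a lookup from index class to its registered strategy, from which the name is read. IndexNameLookupExample and its methods are hypothetical, and the nested IndexingStrategy interface is a reduced stand-in that keeps only getIndexName().

```java
import java.util.HashMap;
import java.util.Map;

class IndexNameLookupExample {
  // Reduced stand-in for the real IndexingStrategy; only the name getter is kept.
  interface IndexingStrategy {
    String getIndexName();
  }

  private final Map<Class<?>, IndexingStrategy> strategies = new HashMap<>();

  void register(Class<?> indexClass, IndexingStrategy strategy) {
    strategies.put(indexClass, strategy);
  }

  // Resolves the index name for a class, failing loudly when nothing is registered.
  String indexNameOf(Class<?> indexClass) {
    IndexingStrategy strategy = strategies.get(indexClass);
    if (strategy == null) {
      throw new IllegalArgumentException(
          "No indexing strategy registered for " + indexClass.getName());
    }
    return strategy.getIndexName();
  }

  public static void main(String[] args) {
    IndexNameLookupExample lookup = new IndexNameLookupExample();
    lookup.register(String.class, () -> "AddressIndex"); // String.class is a placeholder index class
    System.out.println(lookup.indexNameOf(String.class));
  }
}
```

With such a lookup, a delete(Class, ids) overload could resolve the name internally instead of asking the caller for a string.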

register type converters

In some projects, users have custom data types for "Date & Time" objects, so we should provide a mechanism for registering type converters for these types.

class MyIndex {
   private DateTime creationTime;
}

...

GaeSearchApiCuseModule module = new GaeSearchApiCuseModule(TwigEntityLoader.class);

// Converts the custom DateTime type to java.util.Date; null stays null.
module.registerTypeConverter(new Converter<DateTime,Date>() {
   public Date convert(DateTime dateTime) {
       if (dateTime == null) {
            return null;
       }
       return dateTime.getDate();
   }
});

With such converters, we would be able to remove duplication in our index classes, such as:

public TeamIndex(Team team){
    this.id = team.getId();
    this.locationIds = team.getLocationIds();
    this.departmentIds = team.getDepartmentIds();
    if(team.getDeletedOn() != null) {
      this.deletedOn = team.getDeletedOn().getDate();
    }
}
