Elasticsearch Adventure(2): Better Practice?

As development and testing of the search engine progressed, Tony ran into more problems. He has now learned the basics of Elasticsearch and is looking for better practices to solve them.

Query Condition Combination

Tony, “Mentor, for the time being I am using the REST API to access Elasticsearch. Now we have a requirement for very flexible searching: users can choose and combine different fields using checkboxes. I am wondering how to implement it gracefully. For now, all I can think of is keeping a set of pre-written ES query strings, but that seems cumbersome, and I think I may need some kind of string builder to compose a customized query.”

Tim, “Hmm, maybe the QueryBuilders of the Elasticsearch Java API are what you want. Let’s say we want a bool query¹, and some of its conditions may or may not be present. We can do something like the following:”

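// The static import keeps the builder calls short.
import static org.elasticsearch.index.query.QueryBuilders.*;
import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.index.query.BoolQueryBuilder;

// Clauses that are always part of the query
BoolQueryBuilder bool = boolQuery()
    .should(matchQuery("must_have1name", info.getQuery()))
    .should(nestedQuery("nested1", matchQuery("nested1.des", info.getQuery()), ScoreMode.Avg));
// Optional filters are added only when the user supplied a value
if (info.A() != null) {
  bool.filter(termQuery("a", info.A()));
}
if (info.B() != null) {
  bool.filter(termQuery("b", info.B()));
}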

Tony, “Great, this is what I want. I can use QueryBuilders to compose any query the user requests. Thanks.”
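For readers who stay on the REST API, the same idea (clauses that are always present, plus a `filter` clause only for the fields the user actually ticked) can be sketched in plain Java without any Elasticsearch dependency. The class name `BoolQueryJson`, the sample field names, and the hand-rolled escaping are assumptions made for illustration; in a real project a JSON library or the `QueryBuilders` API is likely the safer choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BoolQueryJson {

    // Escapes backslashes and double quotes only (illustrative, not a full JSON encoder).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a bool query body: one match clause on `textField`, plus one
    // term filter for every entry in `filters` whose value is not null.
    static String build(String textField, String text, Map<String, String> filters) {
        List<String> filterClauses = new ArrayList<>();
        for (Map.Entry<String, String> e : filters.entrySet()) {
            if (e.getValue() != null) { // fields the user did not select are skipped
                filterClauses.add("{\"term\":{" + quote(e.getKey()) + ":" + quote(e.getValue()) + "}}");
            }
        }
        return "{\"query\":{\"bool\":{"
                + "\"should\":[{\"match\":{" + quote(textField) + ":" + quote(text) + "}}],"
                + "\"filter\":[" + String.join(",", filterClauses) + "]"
                + "}}}";
    }

    public static void main(String[] args) {
        Map<String, String> filters = new LinkedHashMap<>();
        filters.put("a", "x");
        filters.put("b", null); // the user left this checkbox empty
        System.out.println(build("must_have1name", "hello", filters));
    }
}
```

The null check inside the loop plays the same role as the `if (info.A() != null)` guards: a condition that is absent never reaches the query.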

Tree Structure

Tony, “Mentor, I’ve run into another problem: I have to implement foreign-key-like relationships in ES.”

Tim, “Have you tried parent-child relationships and nested objects?”

Tony, “Yes. A parent-child relationship is suitable for one-to-many mappings, but a child can’t refer to the same type as its parent. For the same reason, we can’t use nested objects.”

Tim, “So you mean you are storing self-referential documents. For the time being, ES doesn’t support this kind of document. Actually, ES (at least according to its maintainers) doesn’t recommend any relationship between different indices. Every document should be self-contained, so that querying one index is enough.”

Tony, “So do I have to solve this in the business logic layer?”

Tim, “Maybe not. In very special cases, like this file system example, we can use a customized analyzer and multi-fields to achieve a sort of self-reference. For more information about relationships in ES, you may like to read the Relations in ES guide listed in the references.”

Update by Query

Today, Tony invited Tim to review his code. While going through part of the business logic, Tim found some code that needed improvement.

// query the `BPO`s whose `aid` is the one we want
List<BPO> bPOs = bRepo.findByAId(aid);
List<String> ids = bPOs.stream().map(BPO::getId).collect(Collectors.toList());
// use each `BPO`'s id to update it, one request per document
for (String id : ids) {
	UpdateQuery updateQuery = new UpdateQueryBuilder()
	    // class is used to infer `index` and `type`
	    .withClass(BPO.class)
	    .withId(id)
	    // indexRequest (built elsewhere) will be used as `doc`
	    .withIndexRequest(indexRequest).build();
	template.update(updateQuery);
}

Tim, “This part of the code seems to need some improvement. What do you think?”

Tony, “Oh, I can use a batch request to update them all in one request. My mistake.”
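Tony's batch idea can be sketched at the REST level as a single `_bulk` body, in which every document contributes one action line and one partial-document line. The snippet below is a minimal illustration in plain Java; the class name `BulkUpdateBody`, the index name `bpo`, and the field `status` are hypothetical, and older Elasticsearch versions may also expect a `_type` in the action line.

```java
import java.util.Arrays;
import java.util.List;

public class BulkUpdateBody {

    // Escapes backslashes and double quotes only (illustrative, not a full JSON encoder).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Emits two newline-terminated lines per id: the `update` action and the
    // partial document to merge. The bulk format requires the final newline.
    static String build(String index, List<String> ids, String field, String value) {
        StringBuilder body = new StringBuilder();
        for (String id : ids) {
            body.append("{\"update\":{\"_index\":").append(quote(index))
                .append(",\"_id\":").append(quote(id)).append("}}\n");
            body.append("{\"doc\":{").append(quote(field)).append(":")
                .append(quote(value)).append("}}\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Hypothetical index, ids, and field, for illustration only.
        System.out.print(build("bpo", Arrays.asList("1", "2"), "status", "done"));
    }
}
```

All updates then travel in one HTTP request instead of one request per document, although the ids still have to be fetched first.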

Tim, “Great. Just one more question: can we do better? Can we skip fetching the BPOs and update by query instead, like we do in SQL? ES actually has the `update_by_query` feature. Although it is not as powerful as its SQL counterpart (it can only search and update within the same index), it is suitable for your use case. Furthermore, it can also be used to [pick up a new property](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html#picking-up-a-new-property).”

#### Shard Allocation

A shard is not free. Remember:

- A shard is a Lucene index under the covers, which uses file handles, memory, and CPU cycles.
- Every search request needs to hit a copy of every shard in the index. That’s fine if every shard is sitting on a different node, but not if many shards have to compete for the same resources.
- Term statistics, used to calculate relevance, are per shard. Having a small amount of data in many shards leads to poor relevance.

### Ref

 - [Relations in ES](https://www.elastic.co/guide/en/elasticsearch/guide/master/relations.html)
 - Multi-fields in ES: index a field in different ways

### Opt Index Rating

- [Tune For Indexing Speed](https://www.elastic.co/guide/en/elasticsearch/reference/master/tune-for-indexing-speed.html)
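Coming back to the Update by Query discussion: the request Tim has in mind can be sketched as a body for `POST /<index>/_update_by_query`, where a `query` selects the documents and a `script` changes them in place. This is a minimal plain-Java illustration; the class name `UpdateByQueryBody` and the fields `aid` and `status` are hypothetical, and the `source` key assumes a reasonably recent Elasticsearch version (older releases used `inline`).

```java
public class UpdateByQueryBody {

    // Escapes backslashes and double quotes only (illustrative, not a full JSON encoder).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Selects documents with a term query on `matchField` and sets
    // `targetField` to `newValue` through a parameterized Painless script.
    static String build(String matchField, String matchValue, String targetField, String newValue) {
        return "{"
                + "\"query\":{\"term\":{" + quote(matchField) + ":" + quote(matchValue) + "}},"
                + "\"script\":{\"lang\":\"painless\","
                + "\"source\":\"ctx._source[params.field] = params.value\","
                + "\"params\":{\"field\":" + quote(targetField) + ",\"value\":" + quote(newValue) + "}}"
                + "}";
    }

    public static void main(String[] args) {
        // Hypothetical field names and values, for illustration only.
        System.out.println(build("aid", "42", "status", "done"));
    }
}
```

Passing the field and value as `params` keeps the script text constant, which should let Elasticsearch reuse the compiled script across calls. No ids are fetched by the client at all.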

> Written with [StackEdit](https://stackedit.io/).

¹ A bool query is a compound query in Elasticsearch; see the Elasticsearch documentation for details.
