JanusGraph ID Encoding

Date: 2019/07/16 Categories: Work Tags: janusgraph


I wanted to build my own requests against JanusGraph's mixed index, so I needed to work out how JanusGraph (JG) stores IDs in Elasticsearch (ES).

First, build the ES query:

{
    "query": {
        "match": {
            "子类型": "动作"
        }
    },
    "sort": [
        {"popular":{"order":"desc"}}
    ],
    "size" :2,
    "_source": ["name"]
}

With the query saved as query.json, the command is:

curl -H 'Content-Type: application/json' localhost:9200/esindex_po_mixedindex/mixedIndex/_search -d @query.json

If you only need the IDs, add the following setting to the JSON body and change the request URL to localhost:9200/esindex_po_mixedindex/mixedIndex/_search?filter_path=hits.hits._id

{"_source": false}
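
The same ID-only request can also be assembled from Python. The following is a minimal sketch using only the standard library; the helper name `build_id_only_request` and its parameters are my own and not part of JanusGraph or Elasticsearch, and the host, index, and type are simply the ones from the curl command above. It only builds the request and does not send it.

```python
import json
import urllib.request


def build_id_only_request(query, base_url="http://localhost:9200",
                          index="esindex_po_mixedindex", doc_type="mixedIndex"):
    """Build (but do not send) a _search request that returns only hit IDs.

    Hypothetical helper: the defaults mirror the curl command in this post.
    """
    body = dict(query)        # shallow copy so the caller's dict is untouched
    body["_source"] = False   # serialized as JSON false: skip document sources
    url = f"{base_url}/{index}/{doc_type}/_search?filter_path=hits.hits._id"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),  # a body makes this a POST
        headers={"Content-Type": "application/json"},
    )


query = {
    "query": {"match": {"子类型": "动作"}},
    "sort": [{"popular": {"order": "desc"}}],
    "size": 2,
}
req = build_id_only_request(query)
print(req.get_method(), req.full_url)
```

To actually run the search, pass `req` to `urllib.request.urlopen` and parse the response body with `json.load`.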

The first query returns something like the following (shown here as YAML):

_shards:
  failed: 0
  skipped: 0
  successful: 4
  total: 4
hits:
  hits:
    - _id: 33ush0p8g
      _index: esindex_po_mixedindex
      _score:
      _source:
        name:
          - 射雕英雄传
      _type: mixedIndex
      sort:
        - 922
    - _id: 1r8twwwg0
      _index: esindex_po_mixedindex
      _score:
      _source:
        name:
          - 苏丹
      _type: mixedIndex
      sort:
        - 905
  max_score:
  total: 42263
timed_out: false
took: 2

Here we can see that the IDs are 33ush0p8g and 1r8twwwg0.
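
Pulling the IDs out of a parsed response takes only a few lines. This is a rough sketch: `extract_ids` is a hypothetical helper, and `sample` is a hand-written dictionary in the shape I would expect from the `filter_path=hits.hits._id` variant of the request, not a captured response.

```python
def extract_ids(response):
    """Return the _id of every hit in a parsed search response."""
    # Fall back to empty containers in case the response has no hits section.
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_id"] for hit in hits]


# Hand-written sample using the two IDs from this post (assumed shape).
sample = {"hits": {"hits": [{"_id": "33ush0p8g"}, {"_id": "1r8twwwg0"}]}}
print(extract_ids(sample))  # ['33ush0p8g', '1r8twwwg0']
```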

Now let's look at how JanusGraph converts these into integers. The relevant code is in janusgraph/janusgraph-core/src/main/java/org/janusgraph/util/encoding/LongEncoding.java:

    public static long decode(String s, String symbols) {
        final int B = symbols.length();
        long num = 0;
        for (char ch : s.toCharArray()) {
            num *= B;
            int pos = symbols.indexOf(ch);
            if (pos<0) throw new NumberFormatException("Symbol set does not match string");
            num += pos;
        }
        return num;
    }

Rewritten in Python:

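def decode(s, symbols="0123456789abcdefghijklmnopqrstuvwxyz"):
    """Decode a string ID into an integer, treating symbols as the digit set."""
    B = len(symbols)
    num = 0
    for c in s:
        # Shift the accumulated value one digit to the left, then add this digit.
        # symbols.index raises ValueError if c is not in the symbol set.
        num *= B
        num += symbols.index(c)
    return num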
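
As a sanity check, with the default 36-character symbol set this decoding is plain base-36 parsing, so it should agree with Python's built-in `int(s, 36)`. The sketch below repeats `decode` so that it runs on its own and adds an `encode` function as the inverse. Note that `encode` is my own illustration rather than code taken from the post's source; returning an empty string for zero is a choice made here so that `decode` maps it back to 0.

```python
SYMBOLS = "0123456789abcdefghijklmnopqrstuvwxyz"


def decode(s, symbols=SYMBOLS):
    """Decode a string ID into an integer (same logic as above)."""
    B = len(symbols)
    num = 0
    for c in s:
        num *= B
        num += symbols.index(c)
    return num


def encode(num, symbols=SYMBOLS):
    """Illustrative inverse of decode for non-negative integers."""
    if num < 0:
        raise ValueError("expected a non-negative number")
    B = len(symbols)
    digits = []
    while num != 0:
        digits.append(symbols[num % B])  # least significant digit first
        num //= B
    return "".join(reversed(digits))     # empty string when num == 0


for es_id in ["33ush0p8g", "1r8twwwg0"]:
    value = decode(es_id)
    assert value == int(es_id, 36)   # matches built-in base-36 parsing
    assert encode(value) == es_id    # round-trips back to the ES _id
    print(es_id, value)
```

The round trip only holds for IDs without leading zero symbols, which is the case for both IDs here.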