Skip to content

rekki/go-query

Folders and files

NameName
Last commit message
Last commit date
Apr 13, 2020
Jul 16, 2020
Apr 12, 2020
Apr 12, 2020
Feb 9, 2020
Jul 16, 2020
Jan 6, 2020
Nov 2, 2020
Nov 2, 2020
Nov 2, 2020
Nov 2, 2020
Apr 13, 2020
Apr 13, 2020
Nov 2, 2020
Nov 2, 2020
Apr 9, 2020
Nov 2, 2020
Apr 22, 2022
Aug 14, 2021
Nov 2, 2020
Apr 22, 2022

Repository files navigation

go-query simple []int32 query library GitHub Actions Status codecov GoDoc

Blazingly fast query engine

Used to build and execute queries such as:

n := 10 // total docs in index

And(
        Term(n, "name:hello", []int32{4, 5}),
        Term(n, "name:world", []int32{4, 100}),
        Or(
                Term(n, "country:nl", []int32{20,30}),
                Term(n, "country:uk", []int32{4,30}),
        )
)
  • scoring: only idf score (for now)
  • supported queries: or, and, and_not, dis_max, constant, term
  • normalizers: space_between_digits, lowercase, trim, cleanup, etc
  • tokenizers: left edge, custom, charngram, unique, soundex etc
  • go-query-index: useful example of how to build more complex search engine with the library

query

Usually when you have inverted index you endup having something like:

data := []*Document{}
index := map[string][]int32{}
for docId, d := range document {
     for _, token := range tokenize(normalize(d.Name)) {
        index[token] = append(index[token],docId)
     }
}

then from documents like {hello world}, {hello}, {new york}, {new world} you get inverted index in the form of:

{
    "hello": [0,1],
    "world": [0,3],
    "new": [2,3],
    "york": [2]
}

anyway, if you want to read more on those check out the IR-book

This package helps you query indexes of this form, in fairly efficient way, keep in mind it expects the []int32 slices to be sorted. Example:

package main

import (
    "fmt"

    "github.com/rekki/go-query"
)

func main() {
    totalDocumentsInIndex := 10
    q := query.And(
        query.Or(
            query.Term(totalDocumentsInIndex, "a", []int32{1, 2, 8, 9}),
            query.Term(totalDocumentsInIndex, "b", []int32{3, 9, 8}),
        ),
        query.AndNot(
            query.Or(
                query.Term(totalDocumentsInIndex, "c", []int32{4, 5}),
                query.Term(totalDocumentsInIndex, "c", []int32{4, 100}),
            ),
            query.Or(
                query.Term(totalDocumentsInIndex, "d", []int32{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}),
                query.Term(totalDocumentsInIndex, "e", []int32{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}),
            ),
        ),
    )

    // q.String() is {{a OR b} AND {{d OR e} -({c OR x})}}

    for q.Next() != query.NO_MORE {
        did := q.GetDocId()
        score := q.Score()
        fmt.Printf("matching %d, score: %f\n", did, score)
    }
}

will print:

matching 1, score: 2.639057
matching 2, score: 2.639057
matching 3, score: 2.852632
matching 8, score: 2.639057
matching 9, score: 4.105394