HBase Notes Ⅱ

How To Use

  • The setFilter() method is provided by both Get and Scan to attach a filter. For example:

setFilter(filter);

CompareFilter

  • Filters that inherit from CompareFilter take a comparison operator (LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, or NO_OP) together with a comparator that defines how the values are compared. The available comparators are:
    • BinaryComparator (compares the whole byte array)
    • BinaryPrefixComparator (compares only the leading bytes, from left to right)
    • BitComparator (bitwise operations)
    • NullComparator (checks whether the value is null)
    • RegexStringComparator (matches against a regular expression)
    • SubstringComparator (treats the value and the argument as strings and checks for a substring)
  • An example that uses a RowFilter with a SubstringComparator and prints the matching rows:
    // Keep only the rows whose key contains the substring "-5".
    Filter filter3 = new RowFilter(CompareFilter.CompareOp.EQUAL, new SubstringComparator("-5"));
    scan.setFilter(filter3);
    ResultScanner scanner3 = table.getScanner(scan);
    for (Result res : scanner3) {
        System.out.println(res);
    }
    scanner3.close();
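
To make the pairing of operator and comparator more concrete, here is a minimal plain-Java sketch that imitates the idea without any HBase dependency. The class ComparatorSketch and all of its methods are hypothetical stand-ins, not HBase API. It assumes byte-wise lexicographic ordering for the binary comparisons, a case-insensitive substring check, and a "find anywhere" regex match; NO_OP is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical stand-in for the operator + comparator idea; this is not HBase code.
final class ComparatorSketch {
    enum Op { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER }

    // Lexicographic comparison of unsigned bytes.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }

    // Turns a three-way comparison result into a yes/no answer for an operator.
    static boolean matches(Op op, int cmp) {
        switch (op) {
            case LESS: return cmp < 0;
            case LESS_OR_EQUAL: return cmp <= 0;
            case EQUAL: return cmp == 0;
            case NOT_EQUAL: return cmp != 0;
            case GREATER_OR_EQUAL: return cmp >= 0;
            default: return cmp > 0; // GREATER
        }
    }

    // Binary comparison: the whole key is compared with the expected value.
    static boolean binary(Op op, String rowKey, String expected) {
        return matches(op, compareBytes(bytes(rowKey), bytes(expected)));
    }

    // Prefix comparison: only the leading bytes of the key are compared.
    static boolean binaryPrefix(Op op, String rowKey, String prefix) {
        byte[] key = bytes(rowKey);
        byte[] p = bytes(prefix);
        byte[] head = Arrays.copyOf(key, Math.min(key.length, p.length));
        return matches(op, compareBytes(head, p));
    }

    // Substring comparison, assumed here to ignore case.
    static boolean substring(String rowKey, String needle) {
        return rowKey.toLowerCase().contains(needle.toLowerCase());
    }

    // Regex comparison, assumed here to match anywhere in the key.
    static boolean regex(String rowKey, String pattern) {
        return Pattern.compile(pattern).matcher(rowKey).find();
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

With these assumptions, ComparatorSketch.substring("row-5", "-5") is true while ComparatorSketch.substring("row-4", "-5") is false, which mirrors what the RowFilter example above is meant to select.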

FilterBase

  • CompareFilter subclasses compare one part of each cell (such as the row key or the value) against a comparator. The filters below extend FilterBase directly and cover more specialized needs, such as column-based filtering. The classes inherited from this base class include:
    • SingleColumnValueFilter, returns rows according to the value of a given column.
    • SingleColumnValueExcludeFilter, returns rows according to the value of a given column, but leaves the tested column itself out of the result.
    • PrefixFilter, returns the rows whose key starts with the given prefix.
    • KeyOnlyFilter, returns only the keys without any values.
    • InclusiveStopFilter, includes the stop row in the scan result (a plain scan excludes it).
    • FirstKeyOnlyFilter, returns only the first key of each row; this class is usually used for row counting. P.S. Because columns are sorted by name, if the column names encode a creation time the first key of a row is the earliest column, so this class is also widely used to get the oldest column value.
    • TimestampsFilter, as the name implies, filters by timestamps.
    • ColumnCountGetFilter, restricts the number of columns that will be returned.
    • ColumnPaginationFilter, paginates the columns of one row.
    • ColumnPrefixFilter, filters the data according to the prefix of the column name.
    • RandomRowFilter, includes each row at random, according to a given probability.
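
As a rough illustration of what three of these filters do, the following sketch models a table as a sorted map from row key to a list of column names. It is plain Java, not HBase code: the class DedicatedFilterSketch, its method names, and the sample data are all hypothetical, and string ordering stands in for byte ordering (the two agree for the ASCII keys used here).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory model: sorted row keys, each with a list of column names.
final class DedicatedFilterSketch {
    // PrefixFilter idea: keep the rows whose key starts with the prefix.
    static List<String> prefix(TreeMap<String, List<String>> table, String prefix) {
        List<String> out = new ArrayList<>();
        for (String row : table.keySet()) {
            if (row.startsWith(prefix)) out.add(row);
        }
        return out;
    }

    // A plain scan excludes its stop row; the InclusiveStopFilter idea includes it.
    static List<String> scan(TreeMap<String, List<String>> table,
                             String start, String stop, boolean inclusiveStop) {
        return new ArrayList<>(table.subMap(start, true, stop, inclusiveStop).keySet());
    }

    // ColumnPaginationFilter idea: skip `offset` columns, then return up to `limit`.
    static List<String> page(List<String> columns, int limit, int offset) {
        int from = Math.min(offset, columns.size());
        int to = Math.min(from + limit, columns.size());
        return new ArrayList<>(columns.subList(from, to));
    }

    // Sample data: rows row-01..row-05, each with columns col-00..col-04.
    static TreeMap<String, List<String>> sampleTable() {
        TreeMap<String, List<String>> table = new TreeMap<>();
        for (int r = 1; r <= 5; r++) {
            List<String> columns = new ArrayList<>();
            for (int c = 0; c < 5; c++) columns.add("col-0" + c);
            table.put("row-0" + r, columns);
        }
        return table;
    }
}
```

On the sample table, scanning from row-02 to row-04 yields row-02 and row-03 with an exclusive stop, and additionally row-04 with an inclusive stop; page(columns, 2, 1) yields col-01 and col-02.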

Additional Filter

  • These filters wrap another filter and post-process its results.
    • WhileMatchFilter. Its function is similar to SkipFilter. However, it stops the scan entirely as soon as one column value fails the condition of the wrapped filter. Therefore, it is more efficient.
    • SkipFilter. If any column of a row is filtered out by the wrapped filter, the whole row is removed from the results. More details:
// Plain ValueFilter: drops only the cells whose value equals "bal=0".
Filter filter1 = new ValueFilter(CompareFilter.CompareOp.NOT_EQUAL,
    new BinaryComparator(Bytes.toBytes("bal=0")));
Scan scan = new Scan();
scan.setFilter(filter1);
ResultScanner scanner1 = table.getScanner(scan);
for (Result result : scanner1) {
    for (KeyValue kv : result.raw()) {
        System.out.println("KV: " + kv + ", Value: " +
            Bytes.toString(kv.getValue()));
    }
}
scanner1.close();
// Wrapped in a SkipFilter: drops every row that contains such a cell.
Filter filter2 = new SkipFilter(filter1);
scan.setFilter(filter2);
ResultScanner scanner2 = table.getScanner(scan);
for (Result result : scanner2) {
    for (KeyValue kv : result.raw()) {
        System.out.println("KV: " + kv + ", Value: " +
            Bytes.toString(kv.getValue()));
    }
}
scanner2.close();
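
The difference between the wrapped filter alone, SkipFilter, and WhileMatchFilter is easier to see on a tiny in-memory data set. The sketch below is plain Java and does not use HBase; the class DecoratorSketch, its methods, and the sample values are hypothetical. It is also a simplification: the whileMatch method ends the scan at the first row that contains a failing cell and drops that row, which may not match the real filter cell for cell.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model: a table is a list of rows, a row is a list of cell values.
final class DecoratorSketch {
    // Stand-in for the wrapped filter: keep every value except "bal=0".
    static final Predicate<String> NOT_ZERO = v -> !v.equals("bal=0");

    // Wrapped filter alone: failing cells are dropped, the rest of the row survives.
    static List<List<String>> plain(List<List<String>> rows, Predicate<String> keep) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> kept = new ArrayList<>();
            for (String cell : row) {
                if (keep.test(cell)) kept.add(cell);
            }
            if (!kept.isEmpty()) out.add(kept);
        }
        return out;
    }

    // SkipFilter idea: a single failing cell removes the whole row.
    static List<List<String>> skip(List<List<String>> rows, Predicate<String> keep) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            if (row.stream().allMatch(keep)) out.add(row);
        }
        return out;
    }

    // WhileMatchFilter idea (simplified): the scan ends at the first failing row.
    static List<List<String>> whileMatch(List<List<String>> rows, Predicate<String> keep) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            if (!row.stream().allMatch(keep)) break;
            out.add(row);
        }
        return out;
    }

    // Three sample rows; only the second one contains "bal=0".
    static List<List<String>> sampleRows() {
        return Arrays.asList(
            Arrays.asList("bal=1", "bal=2"),
            Arrays.asList("bal=0", "bal=3"),
            Arrays.asList("bal=4", "bal=5"));
    }
}
```

On the sample rows, plain keeps bal=3 from the second row, skip removes the second row completely, and whileMatch returns only the first row because the scan ends when the second row fails.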

FilterList

  • FilterList combines several filters so that they work together on the same scan. More details:
List<Filter> filters = new ArrayList<Filter>();
Filter filter1 = new RowFilter(CompareFilter.CompareOp.GREATER_OR_EQUAL,
    new BinaryComparator(Bytes.toBytes("row=03")));
filters.add(filter1);
Filter filter2 = new RowFilter(
    CompareFilter.CompareOp.LESS_OR_EQUAL,
    new BinaryComparator(Bytes.toBytes("row=06")));
filters.add(filter2);
Filter filter3 = new RowFilter(
    CompareFilter.CompareOp.EQUAL,
    new RegexStringComparator("col=0{03}"));
filters.add(filter3);
// The default operator is MUST_PASS_ALL: a row must pass every filter.
FilterList filterList1 = new FilterList(filters);
Scan scan = new Scan();
scan.setFilter(filterList1);
ResultScanner scanner1 = table.getScanner(scan);
for (Result result : scanner1) {
    for (KeyValue kv : result.raw()) {
        System.out.println("KV: " + kv + ", Value: " +
            Bytes.toString(kv.getValue()));
    }
}
scanner1.close(); // returns the combined results
// MUST_PASS_ONE: a row only has to pass one of the filters.
FilterList filterList2 = new FilterList(
    FilterList.Operator.MUST_PASS_ONE, filters);
scan.setFilter(filterList2);
ResultScanner scanner2 = table.getScanner(scan);
for (Result result : scanner2) {
    for (KeyValue kv : result.raw()) {
        System.out.println("KV: " + kv + ", Value: " +
            Bytes.toString(kv.getValue()));
    }
}
scanner2.close(); // returns all the rows
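
The two operators behave like a logical AND and a logical OR over the member filters. The following plain-Java sketch shows that with predicates on row keys; it does not use HBase, and the class FilterListSketch, its methods, the sample rows, and the third condition (row key ends with 5) are hypothetical choices made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of combining filters; each filter is a predicate on the row key.
final class FilterListSketch {
    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

    // MUST_PASS_ALL acts as AND over the filters, MUST_PASS_ONE as OR.
    static Predicate<String> combine(Operator op, List<Predicate<String>> filters) {
        return row -> op == Operator.MUST_PASS_ALL
            ? filters.stream().allMatch(f -> f.test(row))
            : filters.stream().anyMatch(f -> f.test(row));
    }

    // Returns the rows accepted by the filter, in their original order.
    static List<String> scan(List<String> rows, Predicate<String> filter) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (filter.test(row)) out.add(row);
        }
        return out;
    }

    // Three sample conditions: key >= "row=03", key <= "row=06", key ends with "5".
    static List<Predicate<String>> sampleFilters() {
        List<Predicate<String>> filters = new ArrayList<>();
        filters.add(row -> row.compareTo("row=03") >= 0);
        filters.add(row -> row.compareTo("row=06") <= 0);
        filters.add(row -> row.endsWith("5"));
        return filters;
    }

    // Sample rows row=01..row=08.
    static List<String> sampleRows() {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 8; i++) rows.add("row=0" + i);
        return rows;
    }
}
```

With the sample data, MUST_PASS_ALL leaves only row=05, while MUST_PASS_ONE returns all eight rows, because every row satisfies at least one of the first two conditions.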

Custom Filter

  • Users can implement the Filter interface or inherit from the FilterBase class to define a custom filter. The following example returns only the rows that contain a cell with a specific value:
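import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.filter.FilterBase;
import org.apache.hadoop.hbase.util.Bytes;

// Keeps only the rows that contain at least one cell equal to the given value.
public class CustomFilter extends FilterBase {
    private byte[] value = null;
    private boolean filterRow = true;
    // No-argument constructor, used when the filter is deserialized.
    public CustomFilter() {
        super();
    }
    public CustomFilter(byte[] value) {
        this.value = value;
    }
    // Called for each new row: assume the row is dropped until a match is seen.
    @Override
    public void reset() {
        this.filterRow = true;
    }
    @Override
    public ReturnCode filterKeyValue(KeyValue kv) {
        if (Bytes.compareTo(value, kv.getValue()) == 0) {
            filterRow = false; // a matching cell was found, keep this row
        }
        return ReturnCode.INCLUDE;
    }
    // Returning true drops the whole row.
    @Override
    public boolean filterRow() {
        return filterRow;
    }
    // Serializes the value so the filter can be sent to the region servers.
    @Override
    public void write(DataOutput dataOutput) throws IOException {
        Bytes.writeByteArray(dataOutput, this.value);
    }
    @Override
    public void readFields(DataInput dataInput) throws IOException {
        this.value = Bytes.readByteArray(dataInput);
    }
}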
  • P.S. To put a custom filter into use, you have to package it as a JAR, distribute it to the region servers, add it to the classpath in the configuration file named hbase-env.sh, and restart the daemons. After that, you can test your filter. An example of the configuration change:
# Extra Java CLASSPATH elements. Optional.
# export HBASE_CLASSPATH=
export HBASE_CLASSPATH="[your jar file path]"
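
Coming back to the CustomFilter class above, its per-row flag only makes sense once the order of the callbacks is clear. The sketch below replays that order in plain Java, without HBase: the class RowLifecycleSketch and its sample data are hypothetical, cells are plain strings, and the calling order (reset, then every cell, then filterRow) is an assumption taken from how the CustomFilter example is written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical replay of the per-row callbacks used by the CustomFilter example.
final class RowLifecycleSketch {
    private final String wanted;
    private boolean filterRow = true;

    RowLifecycleSketch(String wanted) {
        this.wanted = wanted;
    }

    // Before each row: assume the row is dropped until a match is seen.
    void reset() {
        filterRow = true;
    }

    // Once per cell: a matching value marks the row as kept.
    void filterKeyValue(String cellValue) {
        if (wanted.equals(cellValue)) filterRow = false;
    }

    // After the last cell of the row: true means the row is dropped.
    boolean filterRow() {
        return filterRow;
    }

    // Drives the callbacks in the assumed order and collects the surviving rows.
    static List<List<String>> scan(List<List<String>> rows, String wanted) {
        RowLifecycleSketch filter = new RowLifecycleSketch(wanted);
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            filter.reset();
            for (String cell : row) filter.filterKeyValue(cell);
            if (!filter.filterRow()) out.add(row);
        }
        return out;
    }

    // Sample rows; only the second one contains the value "target".
    static List<List<String>> sampleRows() {
        return Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("c", "target"),
            Arrays.asList("d"));
    }
}
```

Scanning the sample rows for "target" returns only the second row. Without the reset() call, the flag set by that row would leak into the third row and keep it as well.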