Want to contribute? Great! This file will guide you through some of
the design decisions and ideology you should try to adhere to.
-------
This document is outdated and will be updated with the asyncbigtable details.
The build process has changed and currently there are no unit tests to run.
However, the basic programming principles and coding guidelines are always valid.
-------
Getting started:
# You need make & javac on $PATH. The first run will download dependencies.
$ make
# Run the unit tests
$ make check
# Run the integration tests. See also "Testing" below.
$ HBASE_HOME=~/src/hbase make integration ARGS='test f'
The basics:
- Know the complexity of your algorithms. When you're calling a
method in another library, make sure you're aware of its
complexity. Is it O(n^2) or O(n log n)? How much memory will it
allocate? For instance, Collections.sort() works in O(n log n)
as expected but it does 3 array copies! So it's O(3n) = O(n) in
space instead of O(1).
- Avoid allocating objects whenever you can. If you can, re-use the
same objects in a loop in order to avoid generating too much garbage
for the GC to collect. Objects that are expected to be used very
frequently should be created once and stored in an attribute.
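
As an illustration of the allocation advice above, here is a minimal
sketch that allocates one StringBuilder outside a loop and resets it on
each iteration instead of creating a new one every time. The class and
method names (ReuseSketch, formatAll) are made up for this example and
are not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: ReuseSketch and formatAll are hypothetical names. */
final class ReuseSketch {

  /**
   * Formats each (key, value) pair as "key=value".
   * Thread-safe: all state is local to the call.
   * @param keys The keys, one per pair.
   * @param values The values, same length as {@code keys}.
   * @return The formatted pairs, in input order.
   */
  static List<String> formatAll(final String[] keys, final int[] values) {
    final List<String> formatted = new ArrayList<String>(keys.length);
    // Allocated once, outside the loop, then reset on every iteration.
    final StringBuilder buf = new StringBuilder(32);
    for (int i = 0; i < keys.length; i++) {
      buf.setLength(0);  // Reset the builder without allocating a new one.
      buf.append(keys[i]).append('=').append(values[i]);
      formatted.add(buf.toString());  // The result String itself is needed.
    }
    return formatted;
  }

  public static void main(final String[] args) {
    // Prints [a=1, b=2]
    System.out.println(formatAll(new String[] {"a", "b"}, new int[] {1, 2}));
  }
}
```

The String produced on each iteration still has to be allocated because
it is the result; only the scratch builder is shared across iterations.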
The build system:
- No Ant / Ivy / Maven madness. Using XML to specify a build system
is a dumb idea and those tools are way too slow and don't even do
proper dependency tracking out of the box. They don't come with a
set of standard rules that make packaging easy.
- Don't add new dependencies unless you have a compelling reason to do
so. The code of any dependency will be audited before we can allow
this new dependency.
Code:
- The #1 rule is to write readable code. How much time will it take
for someone else to understand that method? If something is not
obvious, it should be commented.
- Stick to the coding style in the files you're editing.
- Avoid lines longer than 80 characters. There are a few cases where
Java is so verbose that trying to wrap things around to avoid long
lines makes the code unnecessarily hard to read; those cases are the
exception.
- Everything should be properly Javadoc'ed. Everything. Including
private stuff (except maybe 1-line private helper functions).
Document the RuntimeExceptions one should expect to handle.
All javadoc comments are to be written in proper English.
- The javadoc (internal and user-facing) must document the
synchronization requirements. What is thread-safe? Which monitor
needs to be acquired before accessing this attribute?
- No checked exceptions. People should RTFM and handle whatever
exceptions they want to handle. There's a disagreement in the
community about the usefulness and effectiveness of checked
exceptions. I haven't been convinced about the pros so I chose
to not use them.
- Use fine-grained exception types. Everything must derive from
HBaseException. Exceptions should contain all the data required to
help the user code recover from them. HBase's own code is a very
good example of what not to do: it (ab)uses IOException.
- Local variables are named_like_this. I think it's more readable
compared to somethingLikeThis. I think there's a reason why Latin
languages put a space between words.
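
To make several of the rules above concrete, here is a rough sketch of
an unchecked, fine-grained exception that carries the data a caller
needs in order to recover, with Javadoc that states thread-safety and
with local variables named_like_this. Everything in it is an assumption
made for illustration: the HBaseException class below is only a local
stand-in so the sketch compiles on its own (the real class lives in the
library and may differ), and MissingTableException, ExceptionSketch,
open and openOrDefault are hypothetical names.

```java
/** Local stand-in for the library's base exception (assumption). */
abstract class HBaseException extends RuntimeException {
  HBaseException(final String msg) {
    super(msg);
  }
}

/**
 * Hypothetical exception: the requested table doesn't exist.
 * Immutable, and therefore thread-safe.
 */
final class MissingTableException extends HBaseException {
  /** Name of the table that couldn't be found. */
  private final String table;

  MissingTableException(final String table) {
    super("No such table: " + table);
    this.table = table;
  }

  /** Returns the name of the missing table, so callers can recover. */
  String getTable() {
    return table;
  }
}

/** Illustrative only: pretends that a single table, "test", exists. */
final class ExceptionSketch {

  /**
   * Pretends to open a table.
   * Thread-safe: touches no shared state.
   * @param table The name of the table to open.
   * @return The name of the table that was opened.
   * @throws MissingTableException if the table doesn't exist.
   */
  static String open(final String table) {
    if (!"test".equals(table)) {
      throw new MissingTableException(table);
    }
    return table;
  }

  /**
   * Opens the given table, falling back to "test" if it is missing.
   * Thread-safe: touches no shared state.
   */
  static String openOrDefault(final String table) {
    try {
      return open(table);
    } catch (MissingTableException e) {
      final String missing_table = e.getTable();  // Local: named_like_this.
      System.err.println("Falling back, no such table: " + missing_table);
      return open("test");
    }
  }

  public static void main(final String[] args) {
    System.out.println(openOrDefault("nope"));  // Prints test
  }
}
```

Because the exception is unchecked, callers that cannot recover simply
don't catch it, while callers that can recover get the table name from
the exception itself rather than by parsing its message.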
Testing:
- Write unit tests whenever possible, especially regression tests.
- Integration tests need to exercise the functionality against a real
HBase instance. You can run them with this command:
$ HBASE_HOME=~/src/hbase make integration ARGS='test f'
This will run the tests against a set of tables that all have names
starting with `test' and families starting with `f'. The tables will
be created / dropped when needed.
- Tip: After you've run the integration tests once, you can speed up
subsequent runs by passing TEST_NO_TRUNCATE=1 before HBASE_HOME on
the command above. This will skip truncating the table in between
tests, which makes them run much faster.
- Tip: You can also specify only one integration test to run, by adding
the following before HBASE_HOME on the command above: TEST_NAME=XXX.