Wednesday, June 29, 2016

/app keeps growing in Cassandra (Cassandra)

During a nodetool repair job, we found an interesting issue: usage of /app goes up quickly and drops back down after the job finishes.

The official documentation says:
"By default, the repair command takes a snapshot of each replica immediately and then sequentially repairs each replica from the snapshots. For example, if you have RF=3 and A, B and C represents three replicas, this command takes a snapshot of each replica immediately and then sequentially repairs each replica from the snapshots (A<->B, A<->C, B<->C) instead of repairing A, B, and C all at once. This allows the dynamic snitch to maintain performance for your application via the other replicas, because at least one replica in the snapshot is not undergoing repair."

So basically, during this repair job the data is compared across all 3 replica nodes (if RF=3) and any inconsistencies found are synced.

If the repair jobs on all nodes are set to start at the same time, this causes HUGE resource and disk usage issues, because snapshots are created for the job across all nodes at once!

So the solution is:
1) Run the repair job on one node and note how long it takes.
2) Schedule the cron jobs for all nodes accordingly, so that they do not overlap each other.
3) Keep an eye on the cron jobs to ensure they work as expected.
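
As a rough illustration of step 2, here is a minimal Python sketch that spreads the repair jobs over the week, giving each node its own slot. The node names, the 4-hour repair duration, the 1-hour buffer and the helper name staggered_cron_entries are all made up for the example; plug in the time you measured in step 1. It assumes a weekly schedule and a plain "nodetool repair" command, which may not match your setup.

```python
MINUTES_PER_DAY = 24 * 60
MINUTES_PER_WEEK = 7 * MINUTES_PER_DAY


def staggered_cron_entries(nodes, repair_minutes, buffer_minutes=60,
                           command="nodetool repair"):
    """Return {node: crontab line} with one non-overlapping weekly slot per node.

    Hypothetical helper: slots start on Sunday 00:00 (cron day-of-week 0)
    and each slot lasts repair_minutes + buffer_minutes.
    """
    slot = repair_minutes + buffer_minutes
    if slot * len(nodes) > MINUTES_PER_WEEK:
        raise ValueError("repairs do not fit in one week without overlapping")

    entries = {}
    for index, node in enumerate(nodes):
        offset = index * slot                       # minutes since Sunday 00:00
        day, rest = divmod(offset, MINUTES_PER_DAY)
        hour, minute = divmod(rest, 60)
        # crontab fields: minute hour day-of-month month day-of-week command
        entries[node] = f"{minute} {hour} * * {day} {command}"
    return entries


if __name__ == "__main__":
    # Assumed values: 3 nodes, a repair measured at about 4 hours on one node.
    for node, line in staggered_cron_entries(
            ["node1", "node2", "node3"], repair_minutes=240).items():
        print(f"{node}: {line}")
```

Each printed line would go into the crontab of the matching node. The buffer is there because repair time can vary from node to node and from run to run, so the measured time from one node is only an estimate.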
