public abstract class LoadBalancerService extends AbstractService implements ILoadBalancerService, IServiceShutdown, IEventReportingService
LoadBalancerService collects a variety of performance counters
from hosts and services, identifies over- and under-utilized hosts and
services based on the collected data and reports those to DataServices
so that they can auto-balance, and acts as a clearing house for WARN and
URGENT alerts for hosts and services.
While the LoadBalancerService MAY observe service start/stop events,
it does NOT get directly informed of actions that change the load
distribution, such as index partition moves or reading from a failover
service. Instead, DataServices determine whether or not they are
overloaded and, if so, query the LoadBalancerService for the identity
of under-utilized services. If under-utilized DataServices are
reported by the LoadBalancerService then the DataService will
self-identify index partitions to be shed and move them onto the identified
under-utilized DataServices. The LoadBalancerService learns
of these actions solely through their effect on host and service load as
self-reported by various services.
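The decision described above can be sketched from the point of view of a single data service. This is a minimal illustration only: `LoadBalancerView` and `ShedDecision` are hypothetical names, the interface is a trimmed-down stand-in that keeps just two of the documented methods, and the checked exceptions declared by the real methods are omitted for brevity.

```java
import java.util.UUID;

/** Hypothetical, trimmed-down view of two documented calls; not the real ILoadBalancerService. */
interface LoadBalancerView {
    boolean isHighlyUtilizedDataService(UUID serviceUUID);

    UUID[] getUnderUtilizedDataServices(int minCount, int maxCount, UUID exclude);
}

/** Sketch of the choice a data service makes for itself (assumed helper, not part of the API). */
final class ShedDecision {

    /**
     * Returns the services that could receive index partitions, or an empty
     * array when this service should keep its current load.
     */
    static UUID[] chooseMoveTargets(LoadBalancerView lbs, UUID self, int maxTargets) {
        if (!lbs.isHighlyUtilizedDataService(self)) {
            return new UUID[0]; // not overloaded: nothing to shed
        }
        // minCount=0 accepts "no recommendation"; exclude ourselves as a target.
        UUID[] targets = lbs.getUnderUtilizedDataServices(0, maxTargets, self);
        return targets == null ? new UUID[0] : targets;
    }
}
```

Which index partitions to move is left to the data service itself; the load balancer only learns of the move later, through the changed counters.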
Note: utilization should be defined in terms of transient system resources: CPU, IO (DISK and NET), and RAM. DISK exhaustion, on the other hand, is the basis for WARN or URGENT alerts since it can lead to immediate failure of all services on the same host.
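To make the distinction concrete, here is a small sketch that separates a utilization figure from disk-based alerting. Everything in it is an assumption for illustration: the `HostSample` type, the choice of taking the busiest resource as the utilization, and the 5% / 15% free-space thresholds are not taken from the actual implementation.

```java
/** Hypothetical sample of host counters, each normalized to the range 0..1. */
final class HostSample {
    final double cpu;
    final double io;
    final double ram;
    final double diskFree; // fraction of disk space still free

    HostSample(double cpu, double io, double ram, double diskFree) {
        this.cpu = cpu;
        this.io = io;
        this.ram = ram;
        this.diskFree = diskFree;
    }
}

enum AlertLevel { NONE, WARN, URGENT }

final class UtilizationSketch {

    /** Utilization looks only at transient resources; using the maximum is an assumption. */
    static double utilization(HostSample s) {
        return Math.max(s.cpu, Math.max(s.io, s.ram));
    }

    /** Disk exhaustion drives alerts instead of the score; thresholds are illustrative. */
    static AlertLevel diskAlert(HostSample s) {
        if (s.diskFree < 0.05) return AlertLevel.URGENT;
        if (s.diskFree < 0.15) return AlertLevel.WARN;
        return AlertLevel.NONE;
    }
}
```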
Note: When new services are made available, either on new hosts or on the
existing hardware, service utilization discrepancies should become
rapidly apparent (within a few minutes). Once we have collected performance
counters for the new hosts / services, subsequent overflow event(s) on
existing DataService(s) will cause index partition moves to be
nominated targeting the new hosts and services. The amount of time that it
takes to re-balance the load on the services will depend in part on the write
rate, since writes drive overflow events and index partition splits, both of
which lead to pre-conditions for index partition moves.
Note: If a host is suffering high IOWAIT then it is probably "hot for read". Writes are heavily buffered and purely sequential and are therefore unlikely to cause high IOWAIT, whereas reads are typically random on journals even though a key range scan is sequential on index segments. Therefore a "hot for read" condition should be addressed by increasing the replication count for those service(s) which are being swamped by read requests on a host suffering from high IOWAIT.
See Also: IRequiredHostCounters (a core set of variables to support decision-making), http://www.google.com/search?hl=en&q=load+balancing+jini

| Modifier and Type | Class and Description |
|---|---|
| static interface | LoadBalancerService.Options: Options understood by the LoadBalancerService. |
| protected class | LoadBalancerService.RoundRobinServiceLoadHelper: Integration with the LoadBalancerService. |
| protected class | LoadBalancerService.ServiceLoadHelperWithoutScores: Integration with the LoadBalancerService. |
| protected class | LoadBalancerService.ServiceLoadHelperWithScores: Integration with the LoadBalancerService. |
| protected class | LoadBalancerService.UpdateTask: Computes and updates the ServiceScores based on an examination of aggregated performance counters. |

| Modifier and Type | Field and Description |
|---|---|
| protected ConcurrentHashMap<UUID,ServiceScore> | activeDataServices: The set of active services. |
| protected ConcurrentHashMap<String,HostScore> | activeHosts: The active hosts (one or more services). |
| protected EventReceiver | eventReceiver |
| protected Journal | eventStore: Used to persist the logged events. |
| protected int | historyMinutes: The #of minutes of history that will be smoothed into an average when LoadBalancerService.UpdateTask updates the HostScores and the ServiceScores. |
| protected AtomicReference<HostScore[]> | hostScores: Scores for the hosts in ascending order (least utilized to most utilized). |
| protected long | initialRoundRobinUpdateCount: The #of updates during which getUnderUtilizedDataServices(int, int, UUID) will apply a round robin policy. |
| protected boolean | isTransient: true iff the LBS will refrain from writing state on the disk. |
| protected Condition | joined: Used to await a service join when there are no services. |
| protected ReentrantLock | lock: Lock is used to control access to data structures that are not thread-safe. |
| protected static org.apache.log4j.Logger | log |
| protected File | logDir: The directory in which the service will log the CounterSets and Events. |
| protected long | nupdates: The #of LoadBalancerService.UpdateTasks which have run so far. |
| protected String | ps |
| protected long | serviceJoinTimeout: Service join timeout in milliseconds - used when we need to wait for a service to join before we can recommend an under-utilized service. |
| protected AtomicReference<ServiceScore[]> | serviceScores: Scores for the services in ascending order (least utilized to most utilized). |
| protected ScheduledExecutorService | updateService: Runs a periodic LoadBalancerService.UpdateTask. |

| Constructor and Description |
|---|
| LoadBalancerService(Properties properties): Note: The load balancer MUST NOT collect host statistics unless it is the only service running on that host. |

| Modifier and Type | Method and Description |
|---|---|
| void | destroy(): Destroy the service. |
| protected void | finalized() |
| protected abstract String | getClientHostname(): Return the canonical hostname of the client in the context of an RMI request. |
| Properties | getProperties(): An object wrapping the properties provided to the constructor. |
| Class | getServiceIface(): Returns ILoadBalancerService. |
| UUID | getUnderUtilizedDataService(): Return the UUID of an under-utilized data service. |
| UUID[] | getUnderUtilizedDataServices(int minCount, int maxCount, UUID exclude): Return up to maxCount IDataService UUIDs that are currently under-utilized. |
| protected boolean | isHighlyUtilizedDataService(ServiceScore score, ServiceScore[] scores) |
| boolean | isHighlyUtilizedDataService(UUID serviceUUID): Return true if the service is considered to be "highly utilized". |
| boolean | isOpen(): Return true iff the service is running. |
| protected boolean | isUnderUtilizedDataService(ServiceScore score, ServiceScore[] scores) |
| boolean | isUnderUtilizedDataService(UUID serviceUUID): Return true if the service is considered to be "under-utilized". |
| void | join(UUID serviceUUID, Class serviceIface, String hostname): Notify the LoadBalancerService that a new service is available. |
| void | leave(UUID serviceUUID): Notify the LoadBalancerService that a service is no longer available. |
| void | logCounters(): Logs the counters on a file created using File.createTempFile(String, String, File) in the log directory. |
| protected void | logCounters(File file): Writes the counters on a file. |
| protected void | logCounters(String basename): Writes the counters on a file. |
| void | notify(UUID serviceUUID, byte[] data): Send performance counters. |
| void | notifyEvent(Event e): Accepts the event, either updates the existing event with the same UUID or adds the event to the set of recent events, and then prunes the set of recent events so that all completed events older than #eventHistoryMillis are discarded. |
| long | rangeCount(long fromTime, long toTime): Reports the #of completed events that start in the given interval. |
| Iterator<Event> | rangeIterator(long fromTime, long toTime): Visits completed events that start in the given interval in order by their start time. |
| protected void | setHostScores(HostScore[] a): Normalizes the HostScores and sets them in place. |
| protected void | setServiceScores(ServiceScore[] a): Normalizes the ServiceScores and sets them in place. |
| void | shutdown(): The service will no longer accept new requests, but existing requests will be processed (synchronous). |
| void | shutdownNow(): The service will no longer accept new requests and will make a best effort attempt to terminate all existing requests and return ASAP. |
| void | sighup(): Logs counters to a temp file. |
| LoadBalancerService | start(): Starts the AbstractService. |
| void | urgent(String msg, UUID serviceUUID): An urgent warning issued when the caller is in immediate danger of depleting its resources with a consequence of immediate service and/or host failure(s). |
| void | warn(String msg, UUID serviceUUID): A warning issued by a client when it is in danger of depleting its resources. |

Inherited from AbstractService: clearLoggingContext, getFederation, getHostname, getServiceName, getServiceUUID, setServiceUUID, setupLoggingContext
Inherited from java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Inherited from implemented interfaces: getHostname, getServiceName, getServiceUUID

protected static final org.apache.log4j.Logger log
protected final String ps
protected final long serviceJoinTimeout
protected final ReentrantLock lock
protected final Condition joined
protected ConcurrentHashMap<String,HostScore> activeHosts
protected ConcurrentHashMap<UUID,ServiceScore> activeDataServices
protected AtomicReference<HostScore[]> hostScores
Scores for the hosts in ascending order (least utilized to most utilized).
This array is initially null and gets updated periodically
by the LoadBalancerService.UpdateTask. The main consumer of this information is the
logic in LoadBalancerService.UpdateTask that computes the service utilization.
protected AtomicReference<ServiceScore[]> serviceScores
Scores for the services in ascending order (least utilized to most utilized).
This array is initially null and gets updated periodically
by the LoadBalancerService.UpdateTask. The methods that report service utilization
and under-utilized services are all based on the data in this array.
Since services can leave at any time, that logic MUST also test for
existence of the service in activeDataServices before assuming that the
service is still live.
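The snapshot-plus-liveness-check idea can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: `ScoreEntry` is a hypothetical stand-in for ServiceScore, and `ScoreSnapshot` with its `publish` and `leastUtilizedLive` methods is invented here to show the pattern of reading an immutable sorted array through an AtomicReference and then confirming membership in the concurrent map.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for ServiceScore: a service UUID plus a utilization score. */
final class ScoreEntry {
    final UUID serviceUUID;
    final double score;

    ScoreEntry(UUID serviceUUID, double score) {
        this.serviceUUID = serviceUUID;
        this.score = score;
    }
}

final class ScoreSnapshot {
    /** Services currently believed to be live. */
    final ConcurrentHashMap<UUID, ScoreEntry> active = new ConcurrentHashMap<>();

    /** Sorted scores; null until the first update has run. */
    final AtomicReference<ScoreEntry[]> scores = new AtomicReference<>();

    /** Publishes a new snapshot sorted ascending (least utilized first). */
    void publish(ScoreEntry[] a) {
        ScoreEntry[] copy = a.clone();
        Arrays.sort(copy, Comparator.comparingDouble((ScoreEntry e) -> e.score));
        scores.set(copy);
    }

    /** Returns the least-utilized service that is still active, or null if none. */
    UUID leastUtilizedLive() {
        ScoreEntry[] snapshot = scores.get();
        if (snapshot == null) {
            return null; // no update has run yet
        }
        for (ScoreEntry e : snapshot) {
            // The snapshot may be stale: the service may have left since the update.
            if (active.containsKey(e.serviceUUID)) {
                return e.serviceUUID;
            }
        }
        return null;
    }
}
```

Readers never lock in this sketch: they take whatever array is current and tolerate staleness by re-checking the active set.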
protected long nupdates
The #of LoadBalancerService.UpdateTasks which have run so far.

protected final long initialRoundRobinUpdateCount
The #of updates during which getUnderUtilizedDataServices(int, int, UUID) will apply a round
robin policy.

protected final File logDir
The directory in which the service will log the CounterSets
and Events.
See Also: LoadBalancerService.Options.LOG_DIR

protected final boolean isTransient
true iff the LBS will refrain from writing state on the
disk. This option causes the LBS to use an in-memory eventStore.
In addition, it will refuse to write counter snapshots when this option
is specified.

protected final ScheduledExecutorService updateService
Runs a periodic LoadBalancerService.UpdateTask.

protected final int historyMinutes
The #of minutes of history that will be smoothed into an average when
LoadBalancerService.UpdateTask updates the HostScores and the
ServiceScores.

protected final Journal eventStore
protected final EventReceiver eventReceiver

public LoadBalancerService(Properties properties)
Parameters: properties - See LoadBalancerService.Options

public Properties getProperties()
protected abstract String getClientHostname()

public LoadBalancerService start()
Description copied from class: AbstractService
Starts the AbstractService.
Note: A call to AbstractService.start() is required in order to give subclasses an
opportunity to be fully initialized before they are required to begin
operations. It is impossible to encapsulate the startup logic cleanly
without this ctor() + start() pattern. Those familiar with Objective-C
will recognize this.
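The reason for the two-step pattern is easier to see in a small example. The classes below are hypothetical and unrelated to the real AbstractService; they only show that startup logic which depends on subclass fields has to run after the subclass constructor has finished, which a base-class constructor cannot guarantee.

```java
import java.util.Locale;

/** Hypothetical base class illustrating ctor() + start(); not the real AbstractService. */
abstract class StartableBase {
    private boolean started;

    protected StartableBase() {
        // Deliberately does NOT call onStart(): subclass fields are not assigned yet.
    }

    /** Runs startup logic once the subclass is fully constructed. */
    public StartableBase start() {
        onStart();
        started = true;
        return this;
    }

    protected abstract void onStart();

    public boolean isStarted() {
        return started;
    }
}

final class GreetingService extends StartableBase {
    private final String greeting; // assigned only after the base constructor returns
    private String banner;

    GreetingService(String greeting) {
        this.greeting = greeting;
    }

    @Override
    protected void onStart() {
        // Safe here; calling this from the base constructor would see greeting == null.
        banner = greeting.toUpperCase(Locale.ROOT);
    }

    String banner() {
        return banner;
    }
}
```

A caller writes `new GreetingService("hi").start()`, mirroring how a service object is constructed and then started.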
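Overrides: start in class AbstractService

public boolean isOpen()
Description copied from interface: IServiceShutdown
Return true iff the service is running.
Specified by: isOpen in interface IServiceShutdown

public void shutdown()
Description copied from interface: IServiceShutdown
The service will no longer accept new requests, but existing requests will be processed (synchronous). See IServiceShutdown.Options.SHUTDOWN_TIMEOUT. Implementations SHOULD be
synchronized. If the service is already shut down, then
this method should be a NOP.
Specified by: shutdown in interface IServiceShutdown
Overrides: shutdown in class AbstractService

public void shutdownNow()
Description copied from interface: IServiceShutdown
The service will no longer accept new requests and will make a best effort attempt to terminate all existing requests and return ASAP.
Specified by: shutdownNow in interface IServiceShutdown
Overrides: shutdownNow in class AbstractService

public void destroy()
Description copied from interface: IService
Destroy the service. See DestroyAdmin#destroy().
Specified by: destroy in interface IService
Overrides: destroy in class AbstractService

public final Class getServiceIface()
Returns ILoadBalancerService.
Specified by: getServiceIface in interface IService
Overrides: getServiceIface in class AbstractService

protected void setHostScores(HostScore[] a)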
Normalizes the HostScores and sets them in place.
Parameters: a - The new host scores.

protected void setServiceScores(ServiceScore[] a)
Normalizes the ServiceScores and sets them in place.
Parameters: a - The new service scores.

protected void logCounters(String basename)
Writes the counters on a file.
Parameters: basename - The basename of the file. The file will be written in the
logDir.

protected void logCounters(File file)
Writes the counters on a file.
Parameters: file - The file. If the file exists it will be overwritten.

public void logCounters() throws IOException
Logs the counters on a file created using
File.createTempFile(String, String, File) in the log
directory.
Throws: IOException

public void sighup() throws IOException
Description copied from interface: ILoadBalancerService
Logs counters to a temp file.
Specified by: sighup in interface ILoadBalancerService
Throws: IOException

public void join(UUID serviceUUID, Class serviceIface, String hostname)
Notify the LoadBalancerService that a new service is available.
Note: Embedded services must invoke this method directly when they start up.
Note: Distributed service implementations MUST discover services using a framework, such as jini, and invoke this method the first time a given service is discovered.
Parameters: serviceUUID, serviceIface, hostname
See Also: IFederationDelegate.serviceJoin(IService, UUID), leave(UUID)

public void leave(UUID serviceUUID)
Notify the LoadBalancerService that a service is no longer
available.
Note: Embedded services must invoke this method directly when they shut down.
Note: Distributed service implementations MUST discover services using a framework, such as jini, and invoke this method when a service is no longer registered.
Parameters: serviceUUID - The service UUID.
See Also: IFederationDelegate.serviceLeave(UUID), join(UUID, Class, String)

public void notifyEvent(Event e) throws IOException
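Accepts the event, either updates the existing event with the same
UUID or adds the event to the set of recent events, and then
prunes the set of recent events so that all completed events older than
#eventHistoryMillis are discarded.
Specified by: notifyEvent in interface IEventReceivingService
Throws: IOException
See Also: EventReceiver

public Iterator<Event> rangeIterator(long fromTime, long toTime)
Visits completed events that start in the given interval in order by their start time.
Specified by: rangeIterator in interface IEventReportingService

public long rangeCount(long fromTime, long toTime)
Reports the #of completed events that start in the given interval.
Specified by: rangeCount in interface IEventReportingService
Parameters: fromTime - The first start time to be included.
toTime - The first start time to be excluded.

public void notify(UUID serviceUUID, byte[] data)
Description copied from interface: ILoadBalancerService
Send performance counters.
Specified by: notify in interface ILoadBalancerService
Parameters: serviceUUID - The service UUID that is self-reporting.
data - The serialized performance counter data.

public void warn(String msg, UUID serviceUUID)
Description copied from interface: ILoadBalancerService
A warning issued by a client when it is in danger of depleting its resources.
Specified by: warn in interface ILoadBalancerService
Parameters: msg - A message.
serviceUUID - The service UUID that is self-reporting.

public void urgent(String msg, UUID serviceUUID)
Description copied from interface: ILoadBalancerService
An urgent warning issued when the caller is in immediate danger of depleting its resources with a consequence of immediate service and/or host failure(s).
Specified by: urgent in interface ILoadBalancerService
Parameters: msg - A message.
serviceUUID - The service UUID that is self-reporting.

public boolean isHighlyUtilizedDataService(UUID serviceUUID) throws IOException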
Description copied from interface: ILoadBalancerService
Return true if the service is considered to be "highly
utilized".
Note: This is used mainly to decide when a service should attempt to shed index partitions. This implementation SHOULD reflect the relative rank of the service among all services as well as its absolute load.
Specified by: isHighlyUtilizedDataService in interface ILoadBalancerService
Parameters: serviceUUID - The service UUID.
Returns: true if the service is considered to be "highly utilized".
Throws: IOException

public boolean isUnderUtilizedDataService(UUID serviceUUID) throws IOException
Description copied from interface: ILoadBalancerService
Return true if the service is considered to be
"under-utilized".
Specified by: isUnderUtilizedDataService in interface ILoadBalancerService
Parameters: serviceUUID - The service UUID.
Returns: true if the service is considered to be "under-utilized".
Throws: IOException

protected boolean isHighlyUtilizedDataService(ServiceScore score, ServiceScore[] scores)
protected boolean isUnderUtilizedDataService(ServiceScore score, ServiceScore[] scores)

public UUID getUnderUtilizedDataService() throws IOException, TimeoutException, InterruptedException
Description copied from interface: ILoadBalancerService
Return the UUID of an under-utilized data service. If there is no
under-utilized service, then return the UUID of the service with
the least load.
Specified by: getUnderUtilizedDataService in interface ILoadBalancerService
Throws: TimeoutException - if there are no data services and a timeout occurs while
awaiting a service join.
InterruptedException - if the request is interrupted.
IOException

public UUID[] getUnderUtilizedDataServices(int minCount, int maxCount, UUID exclude) throws IOException, TimeoutException, InterruptedException
Description copied from interface: ILoadBalancerService
Return up to maxCount IDataService UUIDs that are
currently under-utilized.
When minCount is positive, this method will always return at
least minCount service UUIDs; however, the UUIDs
returned MAY contain duplicates if the LoadBalancerService has a
strong preference for allocating load to some services (or for NOT
allocating load to other services). Further, the
LoadBalancerService MAY choose (or be forced to choose) to return
UUIDs for services that are within a nominal utilization range,
or even UUIDs for services that are highly utilized if it could
not otherwise satisfy the request.
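One way to read the minCount / maxCount / exclude contract is sketched below. It is a simplified illustration, not the service's algorithm: `UnderUtilizedSelection.select` is a hypothetical helper, it assumes the candidates are already ordered from least to most utilized, it assumes minCount does not exceed a positive maxCount, and it returns null immediately where the real service may instead wait for a service join and time out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

final class UnderUtilizedSelection {

    /**
     * @param candidates under-utilized services, least utilized first (assumed ordering)
     * @param minCount   minimum #of UUIDs to return, or 0 for no minimum
     * @param maxCount   maximum #of UUIDs to return, or 0 for no maximum
     * @param exclude    optional service to leave out; may be null
     * @return the selected UUIDs, or null if nothing can be recommended
     */
    static UUID[] select(List<UUID> candidates, int minCount, int maxCount, UUID exclude) {
        List<UUID> eligible = new ArrayList<>();
        for (UUID u : candidates) {
            if (!u.equals(exclude)) {
                eligible.add(u);
            }
        }
        if (eligible.isEmpty()) {
            return null; // nothing to recommend in this simplified sketch
        }
        List<UUID> out = new ArrayList<>(eligible);
        if (maxCount > 0 && out.size() > maxCount) {
            out = new ArrayList<>(out.subList(0, maxCount)); // keep the least utilized
        }
        // Pad with duplicates, cycling through the eligible services, to reach minCount.
        for (int i = 0; out.size() < minCount; i++) {
            out.add(eligible.get(i % eligible.size()));
        }
        return out.toArray(new UUID[0]);
    }
}
```

Callers therefore need to tolerate both repeated UUIDs and a null result.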
Specified by: getUnderUtilizedDataServices in interface ILoadBalancerService
Parameters: minCount - The minimum #of services UUIDs to return -or- zero
(0) if there is no minimum limit.
maxCount - The maximum #of services UUIDs to return -or- zero
(0) if there is no maximum limit.
exclude - The optional UUID of a data service to be excluded
from the returned set.
Returns: null IFF no services are recommended at this time
as needing additional load.
Throws: TimeoutException - if there are no data services, or if there is only a single
data service and it is excluded by the request, and a timeout
occurs while awaiting a service join.
InterruptedException - if the request is interrupted.
IOException

Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.