public abstract class LoadBalancerService extends AbstractService implements ILoadBalancerService, IServiceShutdown, IEventReportingService
The LoadBalancerService collects a variety of performance counters from hosts and services, identifies over- and under-utilized hosts and services based on the collected data and reports those to DataServices so that they can auto-balance, and acts as a clearing house for WARN and URGENT alerts for hosts and services.
While the LoadBalancerService MAY observe service start/stop events, it does NOT get directly informed of actions that change the load distribution, such as index partition moves or reading from a failover service. Instead, DataServices determine whether or not they are overloaded and, if so, query the LoadBalancerService for the identity of under-utilized services. If under-utilized DataServices are reported by the LoadBalancerService then the DataService will self-identify index partitions to be shed and move them onto the identified under-utilized DataServices. The LoadBalancerService learns of these actions solely through their effect on host and service load as self-reported by various services.
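The pull-based interaction described above can be sketched from the data service's side. This is a minimal illustration, not the actual implementation: LoadBalancerView, SheddingDataService, onOverflow and the string-based "moves" list are hypothetical names, and the view only mirrors two method signatures of this class (without their checked exceptions).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical, trimmed-down view of the balancer (not the real ILoadBalancerService). */
interface LoadBalancerView {
    boolean isHighlyUtilizedDataService(UUID serviceUUID);
    UUID[] getUnderUtilizedDataServices(int minCount, int maxCount, UUID exclude);
}

/** Hypothetical data service that decides for itself when to shed index partitions. */
final class SheddingDataService {
    final UUID serviceUUID = UUID.randomUUID();
    final List<String> partitions = new ArrayList<>();
    /** Records "partition -> target" pairs; a stand-in for real index partition moves. */
    final List<String> moves = new ArrayList<>();

    /** On overflow: ask the balancer whether we are overloaded, then shed toward its targets. */
    void onOverflow(LoadBalancerView lbs) {
        if (!lbs.isHighlyUtilizedDataService(serviceUUID)) {
            return; // not overloaded, nothing to shed
        }
        // Ask for up to two targets, excluding ourselves.
        UUID[] targets = lbs.getUnderUtilizedDataServices(0, 2, serviceUUID);
        if (targets == null) {
            return; // no service is recommended as needing additional load
        }
        for (UUID target : targets) {
            if (partitions.isEmpty()) {
                break;
            }
            String partition = partitions.remove(partitions.size() - 1);
            moves.add(partition + " -> " + target);
        }
    }

    public static void main(String[] args) {
        final UUID idle = UUID.randomUUID();
        LoadBalancerView lbs = new LoadBalancerView() {
            public boolean isHighlyUtilizedDataService(UUID u) { return true; }
            public UUID[] getUnderUtilizedDataServices(int min, int max, UUID exclude) {
                return new UUID[] { idle };
            }
        };
        SheddingDataService ds = new SheddingDataService();
        ds.partitions.add("p1");
        ds.partitions.add("p2");
        ds.onOverflow(lbs);
        System.out.println(ds.moves); // a single move of p2 to the idle service
    }
}
```

Note that the balancer is never told about the move itself; in this sketch, as in the description above, it would only see the effect in later self-reported counters.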
Note: utilization should be defined in terms of transient system resources: CPU, IO (DISK and NET), RAM. DISK exhaustion, on the other hand, is the basis for WARN or URGENT alerts since it can lead to immediate failure of all services on the same host.
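One way to picture this split between a utilization score and disk alerts is the sketch below. It is illustrative only: HostSample, UtilizationSketch, the equal weighting of CPU, IO and RAM, and the disk-free thresholds are all assumptions, not values taken from this class.

```java
/** Hypothetical per-host sample; every field is a fraction in [0, 1]. */
final class HostSample {
    final double cpu;
    final double io;
    final double ram;
    final double diskFree;

    HostSample(double cpu, double io, double ram, double diskFree) {
        this.cpu = cpu;
        this.io = io;
        this.ram = ram;
        this.diskFree = diskFree;
    }
}

final class UtilizationSketch {
    enum Alert { NONE, WARN, URGENT }

    static final double WARN_DISK_FREE = 0.10;   // assumed threshold
    static final double URGENT_DISK_FREE = 0.02; // assumed threshold

    /** Utilization uses transient resources only (equal weights are an assumption). */
    static double utilization(HostSample s) {
        return (s.cpu + s.io + s.ram) / 3.0;
    }

    /** Disk exhaustion does not feed the score; it raises an alert instead. */
    static Alert diskAlert(HostSample s) {
        if (s.diskFree < URGENT_DISK_FREE) {
            return Alert.URGENT;
        }
        if (s.diskFree < WARN_DISK_FREE) {
            return Alert.WARN;
        }
        return Alert.NONE;
    }

    public static void main(String[] args) {
        HostSample busyButRoomy = new HostSample(0.9, 0.6, 0.3, 0.50);
        HostSample idleButFull = new HostSample(0.1, 0.1, 0.1, 0.01);
        System.out.println(utilization(busyButRoomy) + " " + diskAlert(busyButRoomy));
        System.out.println(utilization(idleButFull) + " " + diskAlert(idleButFull));
    }
}
```

The second host in the usage example is nearly idle yet still raises an URGENT alert, which is the distinction the note draws.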
Note: When new services are made available, either on new hosts or on the existing hardware, service utilization discrepancies should become rapidly apparent (within a few minutes). Once we have collected performance counters for the new hosts / services, subsequent overflow event(s) on existing DataService(s) will cause index partition moves to be nominated targeting the new hosts and services. The amount of time that it takes to re-balance the load on the services will depend in part on the write rate, since writes drive overflow events and index partition splits, both of which lead to pre-conditions for index partition moves.
Note: If a host is suffering high IOWAIT then it is probably "hot for read" (writes are heavily buffered and purely sequential and therefore unlikely to cause high IOWAIT, whereas reads are typically random on journals even though a key range scan is sequential on index segments). Therefore a "hot for read" condition should be addressed by increasing the replication count for those service(s) which are being swamped by read requests on a host suffering from high IOWAIT.
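The "hot for read" reasoning can be written down as a small heuristic. This is a sketch under stated assumptions: HotForReadSketch, the 30% IOWAIT threshold, and the one-step increase capped at a maximum are hypothetical and are not part of this class.

```java
final class HotForReadSketch {
    /** Assumed fraction of time in IOWAIT above which a host counts as "hot for read". */
    static final double HIGH_IOWAIT = 0.30;

    static boolean isHotForRead(double ioWaitFraction) {
        return ioWaitFraction >= HIGH_IOWAIT;
    }

    /** Suggest one more replica for a read-swamped service, up to maxReplicas. */
    static int recommendedReplicationCount(int current, double ioWaitFraction, int maxReplicas) {
        if (!isHotForRead(ioWaitFraction)) {
            return current;
        }
        return Math.min(current + 1, maxReplicas);
    }

    public static void main(String[] args) {
        System.out.println(recommendedReplicationCount(1, 0.45, 3)); // 2
        System.out.println(recommendedReplicationCount(1, 0.05, 3)); // 1
    }
}
```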
See Also: IRequiredHostCounters (a core set of variables to support decision-making), http://www.google.com/search?hl=en&q=load+balancing+jini
Modifier and Type | Class and Description |
---|---|
static interface | LoadBalancerService.Options - Options understood by the LoadBalancerService. |
protected class | LoadBalancerService.RoundRobinServiceLoadHelper - Integration with the LoadBalancerService. |
protected class | LoadBalancerService.ServiceLoadHelperWithoutScores - Integration with the LoadBalancerService. |
protected class | LoadBalancerService.ServiceLoadHelperWithScores - Integration with the LoadBalancerService. |
protected class | LoadBalancerService.UpdateTask - Computes and updates the ServiceScores based on an examination of aggregated performance counters. |
Modifier and Type | Field and Description |
---|---|
protected ConcurrentHashMap<UUID,ServiceScore> | activeDataServices - The set of active services. |
protected ConcurrentHashMap<String,HostScore> | activeHosts - The active hosts (one or more services). |
protected EventReceiver | eventReceiver |
protected Journal | eventStore - Used to persist the logged events. |
protected int | historyMinutes - The #of minutes of history that will be smoothed into an average when LoadBalancerService.UpdateTask updates the HostScores and the ServiceScores. |
protected AtomicReference<HostScore[]> | hostScores - Scores for the hosts in ascending order (least utilized to most utilized). |
protected long | initialRoundRobinUpdateCount - The #of updates during which getUnderUtilizedDataServices(int, int, UUID) will apply a round robin policy. |
protected boolean | isTransient - true iff the LBS will refrain from writing state on the disk. |
protected Condition | joined - Used to await a service join when there are no services. |
protected ReentrantLock | lock - Lock is used to control access to data structures that are not thread-safe. |
protected static org.apache.log4j.Logger | log |
protected File | logDir - The directory in which the service will log the CounterSets and Events. |
protected long | nupdates - The #of LoadBalancerService.UpdateTasks which have run so far. |
protected String | ps |
protected long | serviceJoinTimeout - Service join timeout in milliseconds, used when we need to wait for a service to join before we can recommend an under-utilized service. |
protected AtomicReference<ServiceScore[]> | serviceScores - Scores for the services in ascending order (least utilized to most utilized). |
protected ScheduledExecutorService | updateService - Runs a periodic LoadBalancerService.UpdateTask. |
Constructor and Description |
---|
LoadBalancerService(Properties properties) - Note: The load balancer MUST NOT collect host statistics unless it is the only service running on that host. |
Modifier and Type | Method and Description |
---|---|
void | destroy() - Destroy the service. |
protected void | finalized() |
protected abstract String | getClientHostname() - Return the canonical hostname of the client in the context of an RMI request. |
Properties | getProperties() - An object wrapping the properties provided to the constructor. |
Class | getServiceIface() - Returns ILoadBalancerService. |
UUID | getUnderUtilizedDataService() - Return the UUID of an under-utilized data service. |
UUID[] | getUnderUtilizedDataServices(int minCount, int maxCount, UUID exclude) - Return up to maxCount IDataService UUIDs that are currently under-utilized. |
protected boolean | isHighlyUtilizedDataService(ServiceScore score, ServiceScore[] scores) |
boolean | isHighlyUtilizedDataService(UUID serviceUUID) - Return true if the service is considered to be "highly utilized". |
boolean | isOpen() - Return true iff the service is running. |
protected boolean | isUnderUtilizedDataService(ServiceScore score, ServiceScore[] scores) |
boolean | isUnderUtilizedDataService(UUID serviceUUID) - Return true if the service is considered to be "under-utilized". |
void | join(UUID serviceUUID, Class serviceIface, String hostname) - Notify the LoadBalancerService that a new service is available. |
void | leave(UUID serviceUUID) - Notify the LoadBalancerService that a service is no longer available. |
void | logCounters() - Logs the counters on a file created using File.createTempFile(String, String, File) in the log directory. |
protected void | logCounters(File file) - Writes the counters on a file. |
protected void | logCounters(String basename) - Writes the counters on a file. |
void | notify(UUID serviceUUID, byte[] data) - Send performance counters. |
void | notifyEvent(Event e) - Accepts the event, either updates the existing event with the same UUID or adds the event to the set of recent events, and then prunes the set of recent events so that all completed events older than #eventHistoryMillis are discarded. |
long | rangeCount(long fromTime, long toTime) - Reports the #of completed events that start in the given interval. |
Iterator<Event> | rangeIterator(long fromTime, long toTime) - Visits completed events that start in the given interval in order by their start time. |
protected void | setHostScores(HostScore[] a) - Normalizes the HostScores and sets them in place. |
protected void | setServiceScores(ServiceScore[] a) - Normalizes the ServiceScores and sets them in place. |
void | shutdown() - The service will no longer accept new requests, but existing requests will be processed (synchronous). |
void | shutdownNow() - The service will no longer accept new requests and will make a best effort attempt to terminate all existing requests and return ASAP. |
void | sighup() - Logs counters to a temp file. |
LoadBalancerService | start() - Starts the AbstractService. |
void | urgent(String msg, UUID serviceUUID) - An urgent warning issued when the caller is in immediate danger of depleting its resources with a consequence of immediate service and/or host failure(s). |
void | warn(String msg, UUID serviceUUID) - A warning issued by a client when it is in danger of depleting its resources. |
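Methods inherited from class AbstractService: clearLoggingContext, getFederation, getHostname, getServiceName, getServiceUUID, setServiceUUID, setupLoggingContext
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface IService: getHostname, getServiceName, getServiceUUID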
protected static final org.apache.log4j.Logger log
protected final String ps
protected final long serviceJoinTimeout
protected final ReentrantLock lock
protected final Condition joined
protected ConcurrentHashMap<String,HostScore> activeHosts
protected ConcurrentHashMap<UUID,ServiceScore> activeDataServices
protected AtomicReference<HostScore[]> hostScores
Scores for the hosts in ascending order (least utilized to most utilized). This array is initially null and gets updated periodically by the LoadBalancerService.UpdateTask. The main consumer of this information is the logic in LoadBalancerService.UpdateTask that computes the service utilization.
protected AtomicReference<ServiceScore[]> serviceScores
Scores for the services in ascending order (least utilized to most utilized). This array is initially null and gets updated periodically by the LoadBalancerService.UpdateTask. The methods that report service utilization and under-utilized services are all based on the data in this array. Since services can leave at any time, that logic MUST also test for existence of the service in activeDataServices before assuming that the service is still live.
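The liveness check required above can be sketched as follows. Score and ScoreSnapshotSketch are hypothetical stand-ins for ServiceScore and this class; only the field names and types mirror the fields documented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for ServiceScore. */
final class Score {
    final UUID serviceUUID;
    final double score;

    Score(UUID serviceUUID, double score) {
        this.serviceUUID = serviceUUID;
        this.score = score;
    }
}

final class ScoreSnapshotSketch {
    final ConcurrentHashMap<UUID, Score> activeDataServices = new ConcurrentHashMap<>();
    /** Initially null; replaced wholesale by a periodic update task. */
    final AtomicReference<Score[]> serviceScores = new AtomicReference<>();

    /** Returns services from the last snapshot (ascending) that are still joined. */
    List<UUID> liveServicesByAscendingScore() {
        Score[] snapshot = serviceScores.get(); // read the reference exactly once
        List<UUID> live = new ArrayList<>();
        if (snapshot == null) {
            return live; // no update has run yet
        }
        for (Score s : snapshot) {
            // The snapshot may be stale: the service may have left since it was computed.
            if (activeDataServices.containsKey(s.serviceUUID)) {
                live.add(s.serviceUUID);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        ScoreSnapshotSketch lbs = new ScoreSnapshotSketch();
        Score a = new Score(UUID.randomUUID(), 0.1);
        Score b = new Score(UUID.randomUUID(), 0.7);
        lbs.activeDataServices.put(a.serviceUUID, a);
        lbs.activeDataServices.put(b.serviceUUID, b);
        lbs.serviceScores.set(new Score[] { a, b });
        lbs.activeDataServices.remove(a.serviceUUID); // service "a" leaves
        System.out.println(lbs.liveServicesByAscendingScore().size()); // 1
    }
}
```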
protected long nupdates
The #of LoadBalancerService.UpdateTasks which have run so far.
protected final long initialRoundRobinUpdateCount
The #of updates during which getUnderUtilizedDataServices(int, int, UUID) will apply a round robin policy.
protected final File logDir
The directory in which the service will log the CounterSets and Events.
See Also: LoadBalancerService.Options.LOG_DIR
protected final boolean isTransient
true iff the LBS will refrain from writing state on the disk. This option causes the LBS to use an in-memory eventStore. In addition, it will refuse to write counter snapshots when this option is specified.
protected final ScheduledExecutorService updateService
Runs a periodic LoadBalancerService.UpdateTask.
protected final int historyMinutes
The #of minutes of history that will be smoothed into an average when LoadBalancerService.UpdateTask updates the HostScores and the ServiceScores.
protected final Journal eventStore
protected final EventReceiver eventReceiver
public LoadBalancerService(Properties properties)
Parameters:
properties - See LoadBalancerService.Options
public Properties getProperties()
protected abstract String getClientHostname()
public LoadBalancerService start()
Description copied from class: AbstractService
Starts the AbstractService.
Note: An AbstractService.start() is required in order to give subclasses an opportunity to be fully initialized before they are required to begin operations. It is impossible to encapsulate the startup logic cleanly without this ctor() + start() pattern. Those familiar with Objective-C will recognize this.
start in class AbstractService
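The ctor() + start() pattern mentioned in the note can be illustrated with a small two-phase initialization sketch. TwoPhaseService, GreetingService and onStart are hypothetical names; the point is only that startup logic runs after the subclass constructor has finished.

```java
/** Hypothetical base class using two-phase initialization. */
abstract class TwoPhaseService {
    private boolean open = false;

    TwoPhaseService() {
        // Deliberately no calls to overridable methods here: subclass fields
        // are not yet assigned while the superclass constructor is running.
    }

    /** Second phase: safe to call subclass hooks because construction is complete. */
    final TwoPhaseService start() {
        onStart();
        open = true;
        return this;
    }

    boolean isOpen() {
        return open;
    }

    protected abstract void onStart();
}

final class GreetingService extends TwoPhaseService {
    private final String greeting;
    String startedWith;

    GreetingService(String greeting) {
        this.greeting = greeting;
    }

    @Override
    protected void onStart() {
        // Had the base constructor called this hook, greeting would still be null.
        startedWith = greeting;
    }

    public static void main(String[] args) {
        GreetingService s = new GreetingService("hello");
        System.out.println(s.isOpen());    // false
        s.start();
        System.out.println(s.startedWith); // hello
    }
}
```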
public boolean isOpen()
Description copied from interface: IServiceShutdown
Return true iff the service is running.
isOpen in interface IServiceShutdown
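public void shutdown()
Description copied from interface: IServiceShutdown
The service will no longer accept new requests, but existing requests will be processed (synchronous). See IServiceShutdown.Options.SHUTDOWN_TIMEOUT. Implementations SHOULD be synchronized. If the service is already shutdown, then this method should be a NOP.
shutdown in interface IServiceShutdown
shutdown in class AbstractService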
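public void shutdownNow()
Description copied from interface: IServiceShutdown
The service will no longer accept new requests and will make a best effort attempt to terminate all existing requests and return ASAP.
shutdownNow in interface IServiceShutdown
shutdownNow in class AbstractService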
public void destroy()
Description copied from interface: IService
Destroy the service. See DestroyAdmin#destroy().
destroy in interface IService
destroy in class AbstractService
public final Class getServiceIface()
Returns ILoadBalancerService.
getServiceIface in interface IService
getServiceIface in class AbstractService
protected void setHostScores(HostScore[] a)
Normalizes the HostScores and sets them in place.
Parameters:
a - The new host scores.
protected void setServiceScores(ServiceScore[] a)
Normalizes the ServiceScores and sets them in place.
Parameters:
a - The new service scores.
protected void logCounters(String basename)
Writes the counters on a file.
Parameters:
basename - The basename of the file. The file will be written in the logDir.
protected void logCounters(File file)
Writes the counters on a file.
Parameters:
file - The file. If the file exists it will be overwritten.
public void logCounters() throws IOException
Logs the counters on a file created using File.createTempFile(String, String, File) in the log directory.
Throws:
IOException
public void sighup() throws IOException
Description copied from interface: ILoadBalancerService
Logs counters to a temp file.
sighup in interface ILoadBalancerService
Throws:
IOException
public void join(UUID serviceUUID, Class serviceIface, String hostname)
Notify the LoadBalancerService that a new service is available.
Note: Embedded services must invoke this method directly when they start up.
Note: Distributed service implementations MUST discover services using a framework, such as jini, and invoke this method the first time a given service is discovered.
Parameters:
serviceUUID -
serviceIface -
hostname -
See Also:
IFederationDelegate.serviceJoin(IService, UUID), leave(UUID)
public void leave(UUID serviceUUID)
Notify the LoadBalancerService that a service is no longer available.
Note: Embedded services must invoke this method directly when they shut down.
Note: Distributed service implementations MUST discover services using a framework, such as jini, and invoke this method when a service is no longer registered.
Parameters:
serviceUUID - The service UUID.
See Also:
IFederationDelegate.serviceLeave(UUID), join(UUID, Class, String)
public void notifyEvent(Event e) throws IOException
Accepts the event, either updates the existing event with the same UUID or adds the event to the set of recent events, and then prunes the set of recent events so that all completed events older than #eventHistoryMillis are discarded.
notifyEvent in interface IEventReceivingService
Throws:
IOException
See Also:
EventReceiver
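The update-or-add followed by pruning can be sketched as below. SketchEvent and RecentEventsSketch are hypothetical stand-ins, the clock is passed in explicitly to keep the example deterministic, and measuring an event's age from its end time is an assumption that the description above does not settle.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for Event. */
final class SketchEvent {
    final UUID eventUUID;
    final long startTime;
    final long endTime; // meaningful only when complete
    final boolean complete;

    SketchEvent(UUID eventUUID, long startTime, long endTime, boolean complete) {
        this.eventUUID = eventUUID;
        this.startTime = startTime;
        this.endTime = endTime;
        this.complete = complete;
    }
}

final class RecentEventsSketch {
    final long eventHistoryMillis;
    final Map<UUID, SketchEvent> recent = new LinkedHashMap<>();

    RecentEventsSketch(long eventHistoryMillis) {
        this.eventHistoryMillis = eventHistoryMillis;
    }

    /** Update-or-add, then prune completed events older than the history window. */
    void notifyEvent(SketchEvent e, long now) {
        recent.put(e.eventUUID, e); // replaces an earlier version with the same UUID
        Iterator<SketchEvent> it = recent.values().iterator();
        while (it.hasNext()) {
            SketchEvent old = it.next();
            // Assumption: age is measured from the end time of a completed event.
            if (old.complete && now - old.endTime > eventHistoryMillis) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        RecentEventsSketch events = new RecentEventsSketch(1000L);
        UUID id = UUID.randomUUID();
        events.notifyEvent(new SketchEvent(id, 0L, 0L, false), 0L);   // event starts
        events.notifyEvent(new SketchEvent(id, 0L, 100L, true), 200L); // same event completes
        System.out.println(events.recent.size()); // 1
    }
}
```

Incomplete events are never pruned in this sketch, which matches the description's restriction to completed events.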
public Iterator<Event> rangeIterator(long fromTime, long toTime)
Visits completed events that start in the given interval in order by their start time.
rangeIterator in interface IEventReportingService
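The interval semantics shared by rangeIterator and rangeCount (fromTime is the first start time included, toTime the first start time excluded) correspond to a half-open range query over events ordered by start time. A minimal sketch, assuming an in-memory sorted map rather than the Journal-backed eventStore; EventIndexSketch and its string-named events are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

final class EventIndexSketch {
    /** Events keyed by start time; several events may share a start time. */
    private final NavigableMap<Long, List<String>> byStartTime = new TreeMap<>();

    void add(long startTime, String name) {
        byStartTime.computeIfAbsent(startTime, k -> new ArrayList<>()).add(name);
    }

    /** Events with fromTime <= startTime < toTime, in start-time order. */
    List<String> range(long fromTime, long toTime) {
        List<String> out = new ArrayList<>();
        // subMap(from, inclusive, to, exclusive) is exactly the half-open interval.
        for (List<String> names : byStartTime.subMap(fromTime, true, toTime, false).values()) {
            out.addAll(names);
        }
        return out;
    }

    long rangeCount(long fromTime, long toTime) {
        return range(fromTime, toTime).size();
    }

    public static void main(String[] args) {
        EventIndexSketch index = new EventIndexSketch();
        index.add(10L, "a");
        index.add(20L, "b");
        index.add(30L, "c");
        System.out.println(index.range(10L, 30L));      // [a, b]
        System.out.println(index.rangeCount(10L, 31L)); // 3
    }
}
```

In this sketch fromTime must not exceed toTime, since NavigableMap.subMap rejects an inverted range with an IllegalArgumentException.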
public long rangeCount(long fromTime, long toTime)
Reports the #of completed events that start in the given interval.
rangeCount in interface IEventReportingService
Parameters:
fromTime - The first start time to be included.
toTime - The first start time to be excluded.
public void notify(UUID serviceUUID, byte[] data)
Description copied from interface: ILoadBalancerService
Send performance counters.
notify in interface ILoadBalancerService
Parameters:
serviceUUID - The service UUID that is self-reporting.
data - The serialized performance counter data.
public void warn(String msg, UUID serviceUUID)
Description copied from interface: ILoadBalancerService
A warning issued by a client when it is in danger of depleting its resources.
warn in interface ILoadBalancerService
Parameters:
msg - A message.
serviceUUID - The service UUID that is self-reporting.
public void urgent(String msg, UUID serviceUUID)
Description copied from interface: ILoadBalancerService
An urgent warning issued when the caller is in immediate danger of depleting its resources with a consequence of immediate service and/or host failure(s).
urgent in interface ILoadBalancerService
Parameters:
msg - A message.
serviceUUID - The service UUID that is self-reporting.
public boolean isHighlyUtilizedDataService(UUID serviceUUID) throws IOException
Description copied from interface: ILoadBalancerService
Return true if the service is considered to be "highly utilized".
Note: This is used mainly to decide when a service should attempt to shed index partitions. This implementation SHOULD reflect the relative rank of the service among all services as well as its absolute load.
isHighlyUtilizedDataService in interface ILoadBalancerService
Parameters:
serviceUUID - The service UUID.
Returns:
true if the service is considered to be "highly utilized".
Throws:
IOException
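One way to combine relative rank with absolute load, as the note on isHighlyUtilizedDataService suggests, is sketched below. UtilizationRankSketch and every threshold in it are assumptions for illustration; the actual criteria used by this class are not documented here.

```java
final class UtilizationRankSketch {
    static final double HIGH_RANK = 0.8; // assumed: top 20% by rank
    static final double HIGH_LOAD = 0.7; // assumed absolute load threshold
    static final double LOW_RANK = 0.2;  // assumed: bottom 20% by rank
    static final double LOW_LOAD = 0.3;  // assumed absolute load threshold

    /** Rank in [0, 1] of position index within loads sorted ascending. */
    static double rank(int index, double[] ascendingLoads) {
        if (ascendingLoads.length <= 1) {
            return 0.0; // a lone service has no peers to rank against
        }
        return (double) index / (ascendingLoads.length - 1);
    }

    /** Highly utilized only if both the relative rank and the absolute load are high. */
    static boolean isHighlyUtilized(int index, double[] ascendingLoads) {
        return ascendingLoads.length > 0
                && rank(index, ascendingLoads) >= HIGH_RANK
                && ascendingLoads[index] >= HIGH_LOAD;
    }

    /** Under-utilized only if both the relative rank and the absolute load are low. */
    static boolean isUnderUtilized(int index, double[] ascendingLoads) {
        return ascendingLoads.length > 0
                && rank(index, ascendingLoads) <= LOW_RANK
                && ascendingLoads[index] <= LOW_LOAD;
    }

    public static void main(String[] args) {
        double[] loads = { 0.1, 0.2, 0.3, 0.4, 0.9 };
        System.out.println(isHighlyUtilized(4, loads)); // true
        System.out.println(isHighlyUtilized(3, loads)); // false
        System.out.println(isUnderUtilized(0, loads));  // true
    }
}
```

Requiring both conditions means the busiest service in a mostly idle federation is not asked to shed partitions just because it ranks last.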
public boolean isUnderUtilizedDataService(UUID serviceUUID) throws IOException
Description copied from interface: ILoadBalancerService
Return true if the service is considered to be "under-utilized".
isUnderUtilizedDataService in interface ILoadBalancerService
Parameters:
serviceUUID - The service UUID.
Returns:
true if the service is considered to be "under-utilized".
Throws:
IOException
protected boolean isHighlyUtilizedDataService(ServiceScore score, ServiceScore[] scores)
protected boolean isUnderUtilizedDataService(ServiceScore score, ServiceScore[] scores)
public UUID getUnderUtilizedDataService() throws IOException, TimeoutException, InterruptedException
Description copied from interface: ILoadBalancerService
Return the UUID of an under-utilized data service. If there is no under-utilized service, then return the UUID of the service with the least load.
getUnderUtilizedDataService in interface ILoadBalancerService
Throws:
TimeoutException - if there are no data services and a timeout occurs while awaiting a service join.
InterruptedException - if the request is interrupted.
IOException
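The wait-for-join behavior behind the TimeoutException can be sketched with a lock and condition, mirroring the lock, joined and serviceJoinTimeout fields of this class. JoinAwaitSketch and awaitAnyService are hypothetical names, and the real method selects a service by score rather than returning an arbitrary one.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class JoinAwaitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition joined = lock.newCondition();
    private final ConcurrentHashMap<UUID, String> activeDataServices = new ConcurrentHashMap<>();
    private final long serviceJoinTimeout; // milliseconds

    JoinAwaitSketch(long serviceJoinTimeoutMillis) {
        this.serviceJoinTimeout = serviceJoinTimeoutMillis;
    }

    /** Registers a service and wakes up any caller waiting for a join. */
    void join(UUID serviceUUID, String hostname) {
        lock.lock();
        try {
            activeDataServices.put(serviceUUID, hostname);
            joined.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Waits up to the timeout for at least one service, then returns some service. */
    UUID awaitAnyService() throws TimeoutException, InterruptedException {
        long nanos = TimeUnit.MILLISECONDS.toNanos(serviceJoinTimeout);
        lock.lockInterruptibly();
        try {
            // Loop guards against spurious wakeups; awaitNanos returns the time remaining.
            while (activeDataServices.isEmpty()) {
                if (nanos <= 0L) {
                    throw new TimeoutException("no data services joined");
                }
                nanos = joined.awaitNanos(nanos);
            }
            return activeDataServices.keySet().iterator().next();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        JoinAwaitSketch lbs = new JoinAwaitSketch(100L);
        UUID id = UUID.randomUUID();
        lbs.join(id, "host1");
        System.out.println(id.equals(lbs.awaitAnyService())); // true
    }
}
```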
public UUID[] getUnderUtilizedDataServices(int minCount, int maxCount, UUID exclude) throws IOException, TimeoutException, InterruptedException
Description copied from interface: ILoadBalancerService
Return up to maxCount IDataService UUIDs that are currently under-utilized.
When minCount is positive, this method will always return at least minCount service UUIDs; however, the UUIDs returned MAY contain duplicates if the LoadBalancerService has a strong preference for allocating load to some services (or for NOT allocating load to other services). Further, the LoadBalancerService MAY choose (or be forced to choose) to return UUIDs for services that are within a nominal utilization range, or even UUIDs for services that are highly-utilized if it could otherwise not satisfy the request.
getUnderUtilizedDataServices in interface ILoadBalancerService
Parameters:
minCount - The minimum #of service UUIDs to return -or- zero (0) if there is no minimum limit.
maxCount - The maximum #of service UUIDs to return -or- zero (0) if there is no maximum limit.
exclude - The optional UUID of a data service to be excluded from the returned set.
Returns:
null IFF no services are recommended at this time as needing additional load.
Throws:
TimeoutException - if there are no data services, or if there is only a single data service and it is excluded by the request, and a timeout occurs while awaiting a service join.
InterruptedException - if the request is interrupted.
IOException
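The minCount, maxCount and exclude contract above can be illustrated with a simplified selection routine. This is a sketch under assumptions, not the policy implemented by the ServiceLoadHelper classes: UnderUtilizedSelectionSketch and select are hypothetical, the caller supplies pre-classified lists in ascending order of utilization, and when minCount exceeds the available pool (or maxCount) the sketch pads with duplicates so that minCount wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

final class UnderUtilizedSelectionSketch {
    /**
     * @param underUtilized under-utilized services, least utilized first (no duplicates)
     * @param others        remaining services, least utilized first (no duplicates)
     * @return the selection, or null when nothing is recommended
     */
    static UUID[] select(List<UUID> underUtilized, List<UUID> others,
                         int minCount, int maxCount, UUID exclude) {
        List<UUID> pool = new ArrayList<>(underUtilized);
        pool.remove(exclude); // no-op when exclude is null or absent
        if (pool.isEmpty() && minCount > 0) {
            // Forced to fall back on services that are not under-utilized.
            pool.addAll(others);
            pool.remove(exclude);
        }
        if (pool.isEmpty()) {
            return null;
        }
        int n = maxCount > 0 ? Math.min(pool.size(), maxCount) : pool.size();
        n = Math.max(n, minCount); // honoring minCount may require duplicates
        UUID[] result = new UUID[n];
        for (int i = 0; i < n; i++) {
            result[i] = pool.get(i % pool.size());
        }
        return result;
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        UUID b = UUID.randomUUID();
        UUID d = UUID.randomUUID();
        // Three UUIDs requested, but only b qualifies once a is excluded.
        UUID[] picked = select(Arrays.asList(a, b), Arrays.asList(d), 3, 0, a);
        System.out.println(picked.length); // 3
    }
}
```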
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.