public class WORMStrategy extends AbstractBufferStrategy implements IDiskBasedStrategy, IHABufferStrategy, IBackingReader
Writes are buffered in a write cache. The cache is flushed when it would overflow. As a result only large sequential writes are performed on the store. Reads read through the write cache for consistency.
Note: This is used to realize both the BufferMode.Disk and the BufferMode.Temporary BufferModes. When configured for the BufferMode.Temporary mode: the root blocks will not be written onto the disk, writes will not be forced, and the backing file will be created the first time the DiskOnlyStrategy attempts to write through to the disk. For many scenarios, the backing file will never be created unless the write cache overflows. This provides very low latency on start-up, the same MRMW capability, and allows very large temporary stores.
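The buffering behavior described above can be illustrated with a minimal sketch. It is not the WORMStrategy implementation: the class LazyAppendStore, its methods, and the single fixed-size cache are hypothetical and only mirror the description (writes are buffered, the cache is flushed when it would overflow, reads go through the cache, and the backing file is created on the first flush).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical append-only store; a sketch, not the WORMStrategy code. */
final class LazyAppendStore {
    private final Path file;           // created only on the first flush
    private final ByteBuffer cache;    // the in-memory write cache
    private long cacheBaseOffset = 0;  // file offset of cache position 0
    private FileChannel channel;       // null until the first flush

    LazyAppendStore(Path file, int cacheCapacity) {
        this.file = file;
        this.cache = ByteBuffer.allocate(cacheCapacity);
    }

    /** Appends a record and returns the offset at which it was written. */
    long write(byte[] record) {
        if (record.length > cache.capacity())
            throw new IllegalArgumentException("record larger than cache");
        if (record.length > cache.remaining())
            flush(); // the cache would overflow: one large sequential write
        long offset = cacheBaseOffset + cache.position();
        cache.put(record);
        return offset;
    }

    /** Reads through the cache first, then falls back to the file. */
    byte[] read(long offset, int nbytes) {
        byte[] out = new byte[nbytes];
        if (offset >= cacheBaseOffset) {
            // Still buffered: copy out without disturbing the cache position.
            ByteBuffer view = cache.duplicate();
            view.position((int) (offset - cacheBaseOffset));
            view.get(out);
            return out;
        }
        try {
            ByteBuffer dst = ByteBuffer.wrap(out);
            long pos = offset;
            while (dst.hasRemaining()) {
                int n = channel.read(dst, pos);
                if (n < 0) throw new IOException("unexpected end of file");
                pos += n;
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the buffered records to the file, creating it if necessary. */
    void flush() {
        try {
            if (channel == null) // lazy creation of the backing file
                channel = FileChannel.open(file, StandardOpenOption.CREATE,
                        StandardOpenOption.READ, StandardOpenOption.WRITE);
            cache.flip();
            long pos = cacheBaseOffset;
            while (cache.hasRemaining())
                pos += channel.write(cache, pos);
            cacheBaseOffset = pos;
            cache.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean fileExists() { return Files.exists(file); }
}
```

With an 8-byte cache, a first 5-byte write leaves the file uncreated; a second 5-byte write would overflow the cache, so the first record is flushed and the file appears.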
FIXME Examine behavior when write caching is enabled/disabled for the OS. This has a profound impact. Asynchronous writes of multiple buffers, and the use of smaller buffers, may be absolutely necessary when the write cache is disabled. It may be that swapping sets in because the Windows write cache is being overworked, in which case doing incremental and async IO would help. Compare with behavior on server platforms. See
http://support.microsoft.com/kb/259716,
http://www.accucadd.com/TechNotes/Cache/WriteBehindCache.htm,
http://msdn2.microsoft.com/en-us/library/aa365165.aspx,
http://www.jasonbrome.com/blog/archives/2004/04/03/writecache_enabled.html,
http://support.microsoft.com/kb/811392,
http://mail-archives.apache.org/mod_mbox/db-derby-dev/200609.mbox/%3C44F820A8.6000000@sun.com%3E
/sbin/hdparm -W 0 /dev/hda 0 Disable write caching
/sbin/hdparm -W 1 /dev/hda 1 Enable write caching
See Also:
BufferMode.Disk, BufferMode.Temporary
Modifier and Type | Class and Description |
---|---|
static class | WORMStrategy.StoreCounters<T extends WORMStrategy.StoreCounters<T>> Striped performance counters for IRawStore access, including operations that read or write through to the underlying media. |
static class | WORMStrategy.WormStoreState |
bufferMode, commitOffset, ERR_ADDRESS_IS_NULL, ERR_ADDRESS_NOT_WRITTEN, ERR_BAD_RECORD_SIZE, ERR_BUFFER_EMPTY, ERR_BUFFER_NULL, ERR_BUFFER_OVERRUN, ERR_MAX_EXTENT, ERR_NOT_OPEN, ERR_OPEN, ERR_READ_ONLY, ERR_RECORD_LENGTH_ZERO, ERR_TRUNCATE, initialExtent, log, maximumExtent, nextOffset, WARN
am
NULL
Modifier and Type | Method and Description |
---|---|
void | abort() Resets the WriteCacheService (if enabled). |
void | close() Closes the file immediately (without flushing any pending writes). |
void | closeForWrites() Extended to reset the write cache. |
void | commit() A method that removes assumptions of how a specific strategy commits data. |
void | computeDigest(Object snapshot, MessageDigest digest) Compute the digest of the entire backing store (including the magic, file version, root blocks, etc). |
void | delete(long addr) This implementation cannot release storage allocations and invocations of this method are ignored. |
void | deleteResources() Deletes the backing file(s) (if any). |
void | force(boolean metadata) Force the data to stable storage. |
long | getBlockSequence() Return the #of WriteCache blocks that were written out for the last write set. |
FileChannel | getChannel() Note: This MAY be null. |
CounterSet | getCounters() Return interesting information about the write cache and file operations. |
long | getCurrentBlockSequence() Return the then-current write cache block sequence. |
long | getExtent() The current size of the journal in bytes. |
File | getFile() The backing file. |
int | getHeaderSize() The size of the file header in bytes. |
protected long | getMinimumExtension() Overridden to use the value specified to the constructor. |
RandomAccessFile | getRandomAccessFile() Note: This MAY be null. |
WORMStrategy.StoreCounters<?> | getStoreCounters() Returns the striped performance counters for the store. |
StoreState | getStoreState() A StoreState object references critical transient data that can be used to determine a degree of consistency between stores, specifically for an HA context. |
com.bigdata.journal.WORMStrategy.WORMWriteCacheService | getWriteCacheService() Return the WriteCacheService (mainly for debugging). |
boolean | isFullyBuffered() True iff the store is fully buffered (all reads are against memory). |
boolean | isStable() True iff backed by stable storage. |
void | postHACommit(IRootBlockView rootBlock) Provides a trigger for synchronization of transient state after a commit. |
ByteBuffer | read(long addr) Extended to handle ChecksumErrors by reading on another node in the Quorum (iff the quorum is highly available). |
ByteBuffer | readFromLocalStore(long addr) Read from the local store in support of failover reads on nodes in a highly available Quorum. |
ByteBuffer | readRaw(long offset, ByteBuffer dst) Read on the backing file. |
ByteBuffer | readRootBlock(boolean rootBlock0) Read the specified root block from the backing file. |
void | resetFromHARootBlock(IRootBlockView rootBlock) Reload from the current root block - CAUTION: THIS IS NOT A RESET / ABORT. |
Future<Void> | sendHALogBuffer(IHALogRequest req, IHAWriteMessage msg, IBufferAccess b) Send an IHAWriteMessage and the associated raw buffer through the write pipeline. |
Future<Void> | sendRawBuffer(IHARebuildRequest req, long sequence, long quorumToken, long fileExtent, long offset, int nbytes, ByteBuffer b) Send an IHAWriteMessage and the associated raw buffer through the write pipeline. |
void | setExtentForLocalStore(long extent) Extend local store for a highly available Quorum. |
void | setStoreCounters(WORMStrategy.StoreCounters<?> storeCounters) Replaces the WORMStrategy.StoreCounters object. |
Object | snapshotAllocators() Snapshot the allocators in preparation for computing a digest of the committed allocations. |
long | transferTo(RandomAccessFile out) A block operation that transfers the serialized records (aka the written on portion of the user extent) en masse from the buffer onto an output file. |
void | truncate(long newExtent) Either truncates or extends the journal. |
boolean | useChecksums() |
long | write(ByteBuffer data) Write the data (unisolated). |
void | writeOnStream(OutputStream os, AbstractJournal.ISnapshotData snapshotData, Quorum<HAGlue,QuorumService<HAGlue>> quorum, long token) Write a consistent snapshot of the committed state of the backing store. |
void | writeRawBuffer(HARebuildRequest req, IHAWriteMessage msg, ByteBuffer transfer) Used to support the rebuild protocol. |
void | writeRawBuffer(IHAWriteMessage msg, IBufferAccess b) Write a buffer containing data replicated from the master onto the local persistence store. |
void | writeRootBlock(IRootBlockView rootBlock, ForceEnum forceOnCommit) Write the root block onto stable storage (i.e., flush it through to disk). |
assertOpen, destroy, getBufferMode, getInitialExtent, getMaximumExtent, getMaxRecordSize, getMetaBitsAddr, getMetaStartAddr, getNextOffset, getResourceMetadata, getUUID, isDirty, isOpen, isReadOnly, overflow, requiresCommit, size, transferFromDiskTo
getAddressManager, getByteCount, getOffset, getOffsetBits, getPhysicalAddress, toAddr, toString
getInputStream, getOutputStream
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getAddressManager, getBufferMode, getInitialExtent, getMaximumExtent, getMaxRecordSize, getMetaBitsAddr, getMetaStartAddr, getNextOffset, getOffsetBits, isDirty, requiresCommit
destroy, getResourceMetadata, getUUID, isOpen, isReadOnly, size
getByteCount, getOffset, getPhysicalAddress, toAddr, toString
getInputStream, getOutputStream
public com.bigdata.journal.WORMStrategy.WORMWriteCacheService getWriteCacheService()
Description copied from interface: IHABufferStrategy
Return the WriteCacheService (mainly for debugging).
Specified by:
getWriteCacheService in interface IHABufferStrategy
public boolean useChecksums()
Description copied from class: AbstractBufferStrategy
Specified by:
useChecksums in interface IBufferStrategy
Overrides:
useChecksums in class AbstractBufferStrategy
public final int getHeaderSize()
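Description copied from interface: IDiskBasedStrategy
The size of the file header in bytes.
Specified by:
getHeaderSize in interface IBufferStrategy
Specified by:
getHeaderSize in interface IDiskBasedStrategy
public final File getFile()
Description copied from interface: IDiskBasedStrategy
The backing file.
Specified by:
getFile in interface IDiskBasedStrategy
Specified by:
getFile in interface IRawStore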
public final RandomAccessFile getRandomAccessFile()
Note: This MAY be null. If BufferMode.Temporary is used then it WILL be null until the #writeCache is flushed to disk for the first time.
Specified by:
getRandomAccessFile in interface IDiskBasedStrategy
public final FileChannel getChannel()
Note: This MAY be null. If BufferMode.Temporary is used then it WILL be null until the #writeCache is flushed to disk for the first time.
Specified by:
getChannel in interface IDiskBasedStrategy
public WORMStrategy.StoreCounters<?> getStoreCounters()
public void setStoreCounters(WORMStrategy.StoreCounters<?> storeCounters)
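Replaces the WORMStrategy.StoreCounters object.
Parameters:
storeCounters - The new BTree.Counters.
Throws:
IllegalArgumentException - if the argument is null.
public CounterSet getCounters()
Return interesting information about the write cache and file operations.
Specified by:
getCounters in interface ICounterSetAccess
Specified by:
getCounters in interface IBufferStrategy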
public final boolean isStable()
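Description copied from interface: IRawStore
True iff backed by stable storage.
public boolean isFullyBuffered()
Description copied from interface: IRawStore
True iff the store is fully buffered (all reads are against memory).
Note: This does not guarantee that the OS will not swap the buffer onto disk.
Specified by:
isFullyBuffered in interface IRawStore
public void force(boolean metadata)
Description copied from interface: IRawStore
Force the data to stable storage.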
public void commit()
This implementation flushes the write cache (if enabled).
Specified by:
commit in interface IBufferStrategy
Overrides:
commit in class AbstractBufferStrategy
public long getBlockSequence()
Description copied from interface: IHABufferStrategy
Return the #of WriteCache blocks that were written out for the last write set. This is used to communicate the #of write cache blocks in the commit point back to AbstractJournal.commitNow(long). It is part of the commit protocol.
Note: This DOES NOT reflect the current value of the block sequence counter for ongoing writes. That counter is owned by the WriteCacheService.
Specified by:
getBlockSequence in interface IHABufferStrategy
See Also:
WriteCacheService.resetSequence()
FIXME I would prefer to expose the WriteCacheService to the AbstractJournal and let it directly invoke WriteCacheService.resetSequence(). The current pattern requires the IHABufferStrategy implementations to track the lastBlockSequence and is messy.
public long getCurrentBlockSequence()
Description copied from interface: IHABufferStrategy
Return the then-current write cache block sequence.
Specified by:
getCurrentBlockSequence in interface IHABufferStrategy
See Also:
IHABufferStrategy.getBlockSequence()
public void abort()
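Resets the WriteCacheService (if enabled).
Note: This assumes the caller is synchronized appropriately; otherwise writes belonging to other threads will be discarded from the cache!
Specified by:
abort in interface IBufferStrategy
Overrides:
abort in class AbstractBufferStrategy
public void close()
Closes the file immediately (without flushing any pending writes).
Specified by:
close in interface IRawStore
Overrides:
close in class AbstractBufferStrategy
public void deleteResources()
Description copied from interface: IRawStore
Deletes the backing file(s) (if any).
Specified by:
deleteResources in interface IRawStore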
public final long getExtent()
Description copied from interface: IBufferStrategy
The current size of the journal in bytes. See Options.INITIAL_EXTENT.
Specified by:
getExtent in interface IBufferStrategy
public final long getUserExtent()
Description copied from interface: IBufferStrategy
The size of the user data extent in bytes.
Note: The size of the user extent is always smaller than the value reported by IBufferStrategy.getExtent() since the latter also reports the space allocated to the journal header and root blocks.
Specified by:
getUserExtent in interface IBufferStrategy
public ByteBuffer read(long addr)
Extended to handle ChecksumErrors by reading on another node in the Quorum (iff the quorum is highly available).
Read the data (unisolated).
Specified by:
read in interface IRawStore
Parameters:
addr - A long integer that encodes both the offset from which the data will be read and the #of bytes to be read. See IAddressManager.toAddr(int, long).
public ByteBuffer readFromLocalStore(long addr) throws InterruptedException
Read from the local store in support of failover reads on nodes in a highly available Quorum.
This implementation tests the WriteCacheService first and then reads through to the local disk on a cache miss. This is automatically invoked by read(long).
Specified by:
readFromLocalStore in interface IHABufferStrategy
Throws:
InterruptedException
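The addr argument accepted by read(long) and readFromLocalStore(long) packs an offset and a byte count into one long. The following is a minimal sketch of such a packing, not the IAddressManager implementation: the class AddrCodec, the choice of bit split, and the parameter order (byte count first, offset second) are assumptions made for illustration.

```java
/** Hypothetical codec: offset in the high bits, byte count in the low bits. */
final class AddrCodec {
    private final int offsetBits;     // assumed split, chosen by the caller
    private final int byteCountBits;  // remaining bits hold the byte count
    private final long byteCountMask;

    AddrCodec(int offsetBits) {
        if (offsetBits < 1 || offsetBits > 62)
            throw new IllegalArgumentException("offsetBits out of range");
        this.offsetBits = offsetBits;
        this.byteCountBits = 64 - offsetBits;
        this.byteCountMask = (1L << byteCountBits) - 1;
    }

    /** Packs the byte count and the offset into a single long. */
    long toAddr(int nbytes, long offset) {
        if (nbytes < 0 || nbytes > byteCountMask)
            throw new IllegalArgumentException("nbytes out of range");
        if (offset < 0 || offset >= (1L << offsetBits))
            throw new IllegalArgumentException("offset out of range");
        return (offset << byteCountBits) | nbytes;
    }

    /** Recovers the offset from the high bits. */
    long getOffset(long addr) {
        return addr >>> byteCountBits;
    }

    /** Recovers the byte count from the low bits. */
    int getByteCount(long addr) {
        return (int) (addr & byteCountMask);
    }
}
```

For example, with 42 offset bits, packing 1024 bytes at offset 123456789 and then decoding returns the same two values.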
public ByteBuffer readRaw(long offset, ByteBuffer dst)
Read on the backing file. Buffer.remaining() bytes will be read into the caller's buffer, starting at the specified offset in the backing file.
Specified by:
readRaw in interface IBackingReader
Specified by:
readRaw in interface IHABufferStrategy
Parameters:
offset - The offset of the first byte (now absolute, not relative to the start of the data region).
dst - Where to put the data. Bytes will be written at position until limit.
public long write(ByteBuffer data)
Description copied from interface: IRawStore
Write the data (unisolated).
Specified by:
write in interface IRawStore
Parameters:
data - The data. The bytes from the current Buffer.position() to the Buffer.limit() will be written and the Buffer.position() will be advanced to the Buffer.limit(). The caller may subsequently modify the contents of the buffer without changing the state of the store (i.e., the data are copied into the store).
See IAddressManager.
protected long getMinimumExtension()
Overridden to use the value specified to the constructor.
Overrides:
getMinimumExtension in class AbstractBufferStrategy
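Both readRaw(long, ByteBuffer) and write(ByteBuffer) are documented in terms of the java.nio position and limit of the caller's buffer. The sketch below shows those two conventions in isolation using only ByteBuffer; the class BufferConventions and its methods are hypothetical helpers, not part of this API.

```java
import java.nio.ByteBuffer;

/** Hypothetical helpers showing the position/limit conventions only. */
final class BufferConventions {

    /**
     * Write-style contract: copies the bytes from position to limit out of
     * the caller's buffer and leaves position equal to limit.
     */
    static byte[] copyOut(ByteBuffer data) {
        byte[] copy = new byte[data.remaining()];
        data.get(copy); // relative bulk get advances position to limit
        return copy;
    }

    /**
     * Read-style contract: transfers exactly dst.remaining() bytes from src,
     * starting at the given offset, into dst from position up to limit.
     */
    static ByteBuffer fill(byte[] src, int offset, ByteBuffer dst) {
        dst.put(src, offset, dst.remaining());
        return dst;
    }
}
```

Because copyOut returns a private copy, the caller can overwrite its buffer afterwards without affecting the copied bytes, which is the property the write(ByteBuffer) documentation describes.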
public ByteBuffer readRootBlock(boolean rootBlock0)
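Description copied from interface: IBufferStrategy
Read the specified root block from the backing file.
Specified by:
readRootBlock in interface IBufferStrategy
public void writeRootBlock(IRootBlockView rootBlock, ForceEnum forceOnCommit)
Description copied from interface: IBufferStrategy
Write the root block onto stable storage (i.e., flush it through to disk).
Specified by:
writeRootBlock in interface IBufferStrategy
Parameters:
rootBlock - The root block. Which root block is indicated by IRootBlockView.isRootBlock0().
public void truncate(long newExtent)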
Description copied from interface: IBufferStrategy
Either truncates or extends the journal.
Note: Implementations of this method MUST be synchronized so that the operation is atomic with respect to concurrent writers.
Specified by:
truncate in interface IBufferStrategy
Parameters:
newExtent - The new extent of the journal. This value represents the total extent of the journal, including any root blocks together with the user extent.
public long transferTo(RandomAccessFile out) throws IOException
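Description copied from interface: IBufferStrategy
A block operation that transfers the serialized records (aka the written on portion of the user extent) en masse from the buffer onto an output file.
Note: Implementations of this method MUST be synchronized so that the operation is atomic with respect to concurrent writers.
Specified by:
transferTo in interface IBufferStrategy
Parameters:
out - The file to which the buffer contents will be transferred.
Throws:
IOException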
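A bulk transfer of this kind can be sketched with FileChannel.transferTo from java.nio. This is only an illustration under stated assumptions, not the method's actual implementation: the class BulkTransfer, the explicit fromPosition and count arguments, and the 4-byte pretend header in the demo are all hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper; a sketch of a channel-to-file bulk copy. */
final class BulkTransfer {

    /**
     * Copies count bytes of src, starting at fromPosition, onto the current
     * file pointer of out. Returns the number of bytes transferred.
     */
    static long transfer(FileChannel src, long fromPosition, long count,
            RandomAccessFile out) {
        try {
            FileChannel dst = out.getChannel();
            long done = 0;
            while (done < count) {
                // transferTo may move fewer bytes than requested, so loop.
                long n = src.transferTo(fromPosition + done, count - done, dst);
                if (n <= 0) break; // source is shorter than requested
                done += n;
            }
            return done;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Copies everything after a pretend 4-byte header to a second file. */
    static byte[] demo() {
        try {
            Path in = Files.createTempFile("src", ".bin");
            Path outPath = Files.createTempFile("dst", ".bin");
            Files.write(in, new byte[] {0, 0, 0, 0, 1, 2, 3, 4, 5, 6});
            try (FileChannel src = FileChannel.open(in, StandardOpenOption.READ);
                 RandomAccessFile out = new RandomAccessFile(outPath.toFile(), "rw")) {
                transfer(src, 4, 6, out);
            }
            byte[] result = Files.readAllBytes(outPath);
            Files.delete(in);
            Files.delete(outPath);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The loop matters because a single transferTo call is allowed to move fewer bytes than requested.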
public void closeForWrites()
Extended to reset the write cache.
Note: The file is NOT closed and re-opened in a read-only mode in order to avoid causing difficulties for concurrent readers.
Specified by:
closeForWrites in interface IBufferStrategy
Overrides:
closeForWrites in class AbstractBufferStrategy
public void delete(long addr)
This implementation cannot release storage allocations and invocations of this method are ignored.
Specified by:
delete in interface IRawStore
Overrides:
delete in class AbstractBufferStrategy
Parameters:
addr - A long integer formed using Addr that encodes both the offset at which the data was written and the #of bytes that were written.
public void writeRawBuffer(IHAWriteMessage msg, IBufferAccess b) throws IOException, InterruptedException
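Description copied from interface: IHABufferStrategy
Write a buffer containing data replicated from the master onto the local persistence store.
Specified by:
writeRawBuffer in interface IHABufferStrategy
Throws:
IOException
InterruptedException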
public Future<Void> sendHALogBuffer(IHALogRequest req, IHAWriteMessage msg, IBufferAccess b) throws IOException, InterruptedException
Description copied from interface: IHABufferStrategy
Send an IHAWriteMessage and the associated raw buffer through the write pipeline.
Specified by:
sendHALogBuffer in interface IHABufferStrategy
Parameters:
req - The IHALogRequest for some HALog file.
msg - The IHAWriteMessage.
b - The raw buffer. Bytes from position to limit will be sent. remaining() must equal IHAWriteMessageBase.getSize().
Returns:
The Future for that request.
Throws:
IOException
InterruptedException
public Future<Void> sendRawBuffer(IHARebuildRequest req, long sequence, long quorumToken, long fileExtent, long offset, int nbytes, ByteBuffer b) throws IOException, InterruptedException
Description copied from interface: IHABufferStrategy
Send an IHAWriteMessage and the associated raw buffer through the write pipeline.
Specified by:
sendRawBuffer in interface IHABufferStrategy
Parameters:
req - The IHARebuildRequest to replicate the backing file to the requesting service.
sequence - The sequence of this IHAWriteMessage (origin ZERO (0)).
quorumToken - The quorum token of the leader, which must remain valid across the rebuild protocol.
fileExtent - The file extent as of the moment that the leader begins to replicate the existing backing file.
offset - The starting offset (relative to the root blocks).
nbytes - The #of bytes to be sent.
b - The raw buffer. The buffer will be cleared and filled with the specified data, then sent down the write pipeline.
Returns:
The Future for that request.
Throws:
IOException
InterruptedException
public void writeOnStream(OutputStream os, AbstractJournal.ISnapshotData snapshotData, Quorum<HAGlue,QuorumService<HAGlue>> quorum, long token) throws IOException, QuorumException
Description copied from interface: IHABufferStrategy
Write a consistent snapshot of the committed state of the backing store.
Note: The caller is able to obtain both root blocks atomically, while the strategy may not be aware of the root blocks or may not be able to coordinate their atomic capture.
Note: The caller must ensure that the resulting snapshot will be consistent either by ensuring that no writes occur or by taking a read-lock that will prevent overwrites of committed state during this operation.
Specified by:
writeOnStream in interface IHABufferStrategy
Parameters:
os - Where to write the data.
quorum - The Quorum.
token - The token that must remain valid during this operation.
Throws:
IOException
QuorumException - if the service is not joined with the met quorum for that token at any point during the operation.
public void setExtentForLocalStore(long extent) throws IOException, InterruptedException
Description copied from interface: IHABufferStrategy
Extend local store for a highly available Quorum.
Specified by:
setExtentForLocalStore in interface IHABufferStrategy
Throws:
InterruptedException
IOException
public void resetFromHARootBlock(IRootBlockView rootBlock)
Description copied from interface: IHABufferStrategy
Reload from the current root block - CAUTION: THIS IS NOT A RESET / ABORT.
Note: This method is used when the root blocks of the leader are installed onto a follower. This can change the UUID for the backing store file. The IHABufferStrategy implementation MUST update any cached value for that UUID.
Use IHABufferStrategy.postHACommit(IRootBlockView) rather than this method in the 2-phase commit on the follower.
Specified by:
resetFromHARootBlock in interface IHABufferStrategy
public void postHACommit(IRootBlockView rootBlock)
Description copied from interface: IHABufferStrategy
Provides a trigger for synchronization of transient state after a commit.
For the RWStore this is used to resynchronize the allocators during the 2-phase commit on the follower with the delta in the allocators from the write set associated with that commit.
Specified by:
postHACommit in interface IHABufferStrategy
Parameters:
rootBlock - The newly installed root block.
public Object snapshotAllocators()
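Description copied from interface: IHABufferStrategy
Snapshot the allocators in preparation for computing a digest of the committed allocations.
Specified by:
snapshotAllocators in interface IHABufferStrategy
Returns:
null if the snapshot is a NOP for the IBufferStrategy (e.g., for the WORM).
public void computeDigest(Object snapshot, MessageDigest digest) throws DigestException, IOException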
Description copied from interface: IHABufferStrategy
Compute the digest of the entire backing store (including the magic, file version, root blocks, etc).
Note: The digest is not reliable unless you either use a snapshot or suspend writes (on the quorum) while it is computed.
Specified by:
computeDigest in interface IHABufferStrategy
Parameters:
snapshot - The allocator snapshot (optional). When given, the digest is computed only for the snapshot. When null it is computed for the entire file.
Throws:
DigestException
IOException
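Computing a digest over an entire file, as described for the null snapshot case, can be sketched with MessageDigest and FileChannel from the JDK. This is an illustration only and makes no claim about how computeDigest is implemented: the class FileDigest, the 64 KiB buffer size, and the SHA-256 algorithm used in the demo are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; feeds every byte of a file into a digest. */
final class FileDigest {

    /** Updates the digest with the full contents of the file, in order. */
    static void update(Path file, MessageDigest digest) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
            while (ch.read(buf) != -1) {
                buf.flip();
                digest.update(buf); // consumes the bytes from position to limit
                buf.clear();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Checks the file digest against a digest of the same bytes in memory. */
    static boolean demo() {
        try {
            byte[] content = "root blocks and records".getBytes(StandardCharsets.UTF_8);
            Path file = Files.createTempFile("digest", ".bin");
            Files.write(file, content);
            MessageDigest viaFile = MessageDigest.getInstance("SHA-256");
            update(file, viaFile);
            Files.delete(file);
            byte[] expected = MessageDigest.getInstance("SHA-256").digest(content);
            return MessageDigest.isEqual(viaFile.digest(), expected);
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

As the note above says, a digest taken this way is only meaningful if nothing writes to the file while it is being read.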
public void writeRawBuffer(HARebuildRequest req, IHAWriteMessage msg, ByteBuffer transfer) throws IOException
Description copied from interface: IHABufferStrategy
Used to support the rebuild protocol.
Specified by:
writeRawBuffer in interface IHABufferStrategy
Throws:
IOException
public StoreState getStoreState()
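Description copied from interface: IHABufferStrategy
A StoreState object references critical transient data that can be used to determine a degree of consistency between stores, specifically for an HA context.
Specified by:
getStoreState in interface IHABufferStrategy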
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.