This utility should be designed 'open' enough to start as a 'bare bones' design with no caching, no networking and no fault tolerance - with the option to apply these features (mentioned above) as 'pluggable' modules.
In a nutshell, every object instance has a 'key' related to it - whether it be another object instance or an integer ID. Using this 'key', an object can be added, retrieved, updated and removed from the media.
From the work I have done so far, these are the sections I have divided this utility into:
Read
Using a method such as get(long) or get(Object) (where the long or Object is the 'key'), the StorageTracker will retrieve, either from disk or from memory, the Mapping for this particular 'key'. The Mapping contains the location and size of the byte array storing the object instance for this 'key'. The Mapping interface is the bare minimum for reading and writing data via StorageIO, but can be expanded to implement caching. I will go further into this later.
The location and size are passed to the StorageIO, which, at present, is a thread-safe class used to read and write byte arrays to disk.
Once the byte array has been read from the media, it can be converted back to its original state using AlterKit or sent across a network in its encrypted form to a client for conversion at the workstation.
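To make the pieces above a little more concrete, here is a minimal sketch of what the Mapping interface and the StorageIO class might look like (each type in its own source file). The method names and signatures are assumptions for illustration, not the actual oStore source.

    // Minimal sketch only - names and signatures are assumptions.
    public interface Mapping {
        long getLocation();              // offset of the byte array on the media
        int  getSize();                  // length of the byte array in bytes
        void setLocation(long location);
        void setSize(int size);
    }

    public class StorageIO {
        private final java.io.RandomAccessFile file;

        public StorageIO(String fileName) throws java.io.IOException {
            this.file = new java.io.RandomAccessFile(fileName, "rw");
        }

        // Thread-safe read of 'size' bytes starting at 'location'.
        public synchronized byte[] read(long location, int size) throws java.io.IOException {
            byte[] data = new byte[size];
            file.seek(location);
            file.readFully(data);
            return data;
        }

        // Thread-safe write of 'data' starting at 'location'.
        public synchronized void write(long location, byte[] data) throws java.io.IOException {
            file.seek(location);
            file.write(data);
        }
    }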
If caching had been implemented, using another interface (extending the Mapping interface), an object could be checked for its 'cache status' to determine whether it is currently stored in memory.
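The interface extending Mapping mentioned above might carry the 'cache status' along these lines (again, the names are only assumptions):

    // Sketch of a caching extension to Mapping - names are assumptions.
    public interface CachedMapping extends Mapping {
        boolean isCached();                   // is the object currently held in memory?
        Object  getCachedObject();            // the in-memory copy, or null if not cached
        void    setCachedObject(Object obj);  // store / clear the in-memory copy
    }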
Remove
Using a method such as remove(long) or remove(Object), the StorageTracker will retrieve the Mapping for this particular 'key' and then delete the entry. The location and size are passed to StorageIO and the byte array is read from the media, so that the removed object can be returned. We could optionally not return anything from a remove method, but this is not a large issue as I see it. Once all references to the storage of the object have been removed, it is treated as free space. In future we could implement an undelete/purge system, where recently deleted objects are temporarily stored for an amount of time and purged afterwards.
If caching had been implemented, the object could be checked for its 'cache status' and removed from memory if necessary.
A note: if any indexing of locations or a 'still-to-be-written' status (both mentioned later) is used, a remove method would also remove those entries.
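Pulling the Remove steps together, a hypothetical StorageTracker.remove could look roughly like the following. The class layout, field names, and the decision to return the raw byte array are all assumptions.

    // Sketch of the Remove procedure - names and layout are assumptions.
    public class StorageTracker {
        private final java.util.Map mappings = new java.util.HashMap();   // 'key' -> Mapping
        private final StorageIO io;

        public StorageTracker(StorageIO io) {
            this.io = io;
        }

        // Removes the entry for 'key'; optionally returns the stored bytes.
        public synchronized byte[] remove(Object key) throws java.io.IOException {
            Mapping m = (Mapping) mappings.remove(key);
            if (m == null) {
                return null;                                  // nothing stored under this key
            }
            byte[] data = io.read(m.getLocation(), m.getSize());
            // From here on the space is treated as free; a location index,
            // 'cache status' or 'still-to-be-written' entry (if implemented)
            // would also be cleared here.
            return data;
        }
    }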
Modify
Using AlterKit, the object instance is converted to a byte array and its size noted. The Mapping is read and, if the new size is less than or equal to the previous size, the byte array can remain at the same location. Otherwise it is sent to the end of the file. As a possibility, if the object had been 'null', the action could be treated as a remove - but this is optional and not a large issue. Once the location has been determined, the Mapping can be stored and the byte array sent to StorageIO for writing to media.
This procedure is actually a little more complicated than that. When checking whether the new size will fit in the previous space, the StorageTracker will have to check where the byte array before it ends and where the next byte array begins. It is possible that the byte array before it has become smaller since this byte array was last written to media, allowing a larger size to fit. This is where the 'location index' comes in. If an index of the start locations of all byte arrays is implemented, the locations before and after can be found much more easily, rather than scanning all 'key' Mappings. The last byte array's location can also be found more easily. (A sketch of such an index follows the example below.)
Here is an example of how the indexing would work.
Three byte arrays:
    ID  Location  Size
    56  12650     370
    34  13020     520
    98  13540     120
Now, ID 56 is modified and becomes smaller (350), leaving a 20-byte gap between 56 and 34.
Then ID 34 is modified. Its new size is 530, which would not fit at its current location. But there is a 20-byte gap back to the previous byte array, so the starting location is changed to 12650 + 350 = 13000 and the byte array is stored without problem.
Once the mapping is stored, the location index could be updated accordingly.
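The location index could be kept as a sorted map from start location to Mapping, so that the neighbouring byte arrays in the example above are found without scanning every 'key' Mapping. A rough sketch, with names and structure assumed:

    // Sketch of a location index: start location -> Mapping, kept sorted.
    // Names and structure are assumptions, not the actual oStore code.
    public class LocationIndex {
        private final java.util.TreeMap byLocation = new java.util.TreeMap();

        public void put(Mapping m)    { byLocation.put(new Long(m.getLocation()), m); }
        public void remove(Mapping m) { byLocation.remove(new Long(m.getLocation())); }

        // Returns the location a resized byte array may use, or -1 if it must
        // be appended at the end of the file.
        public long findLocation(Mapping mapping, int newSize) {
            Long start = new Long(mapping.getLocation());
            java.util.SortedMap before = byLocation.headMap(start);
            java.util.SortedMap after  = byLocation.tailMap(new Long(mapping.getLocation() + 1));

            // Earliest possible start: end of the previous byte array (or our own start).
            long earliest = mapping.getLocation();
            if (!before.isEmpty()) {
                Mapping prev = (Mapping) before.get(before.lastKey());
                earliest = prev.getLocation() + prev.getSize();
            }
            // Latest possible end: start of the next byte array (or end of file).
            long limit = Long.MAX_VALUE;
            if (!after.isEmpty()) {
                limit = ((Long) after.firstKey()).longValue();
            }
            if (earliest + newSize <= limit) {
                return earliest;    // e.g. ID 34 moving back to 12650 + 350 = 13000
            }
            return -1;
        }
    }

With the numbers above, findLocation for ID 34 and a new size of 530 would return 13000: the previous entry (ID 56) now ends at 12650 + 350 = 13000, and the next entry (ID 98) starts at 13540, leaving room for the 530 bytes.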
If a 'still-to-be-written' queue were used, the byte array could be 'cached' and written to disk by a medium-priority thread. Once written, the 'cache status' could change to signify it is on disk.
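The 'still-to-be-written' queue could be a simple list drained by a background thread running at normal ('medium') priority. A sketch under those assumptions, with error handling and shutdown left out:

    // Sketch of a 'still-to-be-written' queue flushed by a medium-priority
    // thread - names are assumptions, simplification only.
    public class WriteQueue implements Runnable {
        private final java.util.LinkedList pending = new java.util.LinkedList();
        private final StorageIO io;

        public WriteQueue(StorageIO io) {
            this.io = io;
            Thread writer = new Thread(this, "oStore-writer");
            writer.setPriority(Thread.NORM_PRIORITY);   // 'medium' priority
            writer.setDaemon(true);
            writer.start();
        }

        public synchronized void enqueue(long location, byte[] data) {
            pending.addLast(new Object[] { new Long(location), data });
            notify();
        }

        public void run() {
            while (true) {
                Object[] job;
                synchronized (this) {
                    while (pending.isEmpty()) {
                        try { wait(); } catch (InterruptedException e) { return; }
                    }
                    job = (Object[]) pending.removeFirst();
                }
                try {
                    io.write(((Long) job[0]).longValue(), (byte[]) job[1]);
                    // Here the Mapping's 'cache status' could change to 'on disk'.
                } catch (java.io.IOException e) {
                    // Sketch only: real code would report or retry.
                }
            }
        }
    }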
Create
Well, actually, creating a new entry is part of the Modify procedure. If a Mapping is not found for the ID, a new Mapping is created and the last location + size is used as the new location.
The new entry is added to a location index if necessary, and the 'still-to-be-written' queue is updated if implemented.
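Continuing the hypothetical StorageTracker sketch from the Remove section, the Create case might simply fall out of a put/modify method. SimpleMapping and endOfFile() below are assumed helpers, not part of the described design.

    // Sketch only: the Create case inside a put/modify method.
    public synchronized void put(Object key, byte[] data) throws java.io.IOException {
        Mapping m = (Mapping) mappings.get(key);
        if (m == null) {
            // No Mapping for this 'key': the new entry starts where the last
            // stored byte array ends (last location + size).
            m = new SimpleMapping(endOfFile(), data.length);
            mappings.put(key, m);
            // A location index and 'still-to-be-written' queue, if used,
            // would be updated here too.
        }
        io.write(m.getLocation(), data);
    }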
Other Features
Some features I have thought of implementing:
(Q) exceptions
There are no oStore-defined exceptions. How could the application programmer be made aware of common errors like out of disk space, wrong file name, etc.? Should these exceptions be declared inside the oStore interfaces? Should we declare an oStoreException class?
(A) Ok, first of all, I haven't thought that far ahead yet, and I don't think we need to. It's a good question, but I think we should consider it further up the track. It is an easy task to implement these exceptions at an 'alpha' stage and, as I see it, exceptions do not affect the 'normal' running of oStore. But it is certainly a subject to be dealt with in the future.
(Q) object key
If an oStore instance is made persistent, where will the object IDs be stored from one process execution to another? How could one tell what class the object retrieved on a second process execution belongs to?
Should the object keys be stored in some sort of global index?
(A) Not sure if I understood this one correctly. If I have, I think all instances of oStore can be considered 'persistent' as such, and when an application using ObjStore closes and starts later, all keys will be retrieved as before. As for classes, well, all objects are stored as serialized objects and can be cast into their proper classes when retrieved. (Just write back if I missed the point!)
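For example, assuming a hypothetical store with a get method as described under Read, the caller simply casts the deserialised object back:

    // Hypothetical usage - 'store' and 'Customer' are made-up names.
    Customer c = (Customer) store.get(42L);   // stored via serialization, cast back on retrieval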
(Q) object key generation
Suppose a multi-user oStore: is it safe to leave object key creation to the application programmer? What if there are two processes issuing the same object key to the same oStore instance? Should the keys be generated by the oStore class, with the application programmer just keeping an object key?
(A) Well, it's a good point, and if the IDs are created by oStore it wouldn't pose a problem. Also, we must consider that multiple 'front-ends' can be developed: one for automatic ID creation, another for object-key use. And if processes were to write to the same key, well, we could develop some type of 'contention rule module' to deal with the matter.
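If oStore generated the IDs itself, a simple synchronised counter (persisted along with the rest of the store) would be enough within one process. The class below is only a sketch with assumed names; the multi-process case would still need the 'contention rule module' mentioned above.

    // Sketch of oStore-generated keys - name and constructor are assumptions.
    public class KeyGenerator {
        private long nextKey;

        public KeyGenerator(long lastUsedKey) {
            this.nextKey = lastUsedKey + 1;     // resume from the persisted value
        }

        // Safe for multiple threads within one process.
        public synchronized long nextKey() {
            return nextKey++;
        }
    }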
(Q) Is there any way to flush said cache on exit, specifically without having to manually call a flush() member? DigiGod
(A) Well, since that version of oStore (on SourceServer), I've realised you can use the method Runtime.runFinalizersOnExit(true); and then implement a finalize() method to automatically close. I do need to post the latest versions of the source code, so as not to cause too much confusion.
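A sketch of that idea is below. Runtime.runFinalizersOnExit is a real JDK call of that era but was later deprecated (and eventually removed), and the class name and flush() method here are assumptions.

    // Sketch of flush-on-exit via finalizers (1998-era JDK API).
    public class OStore {
        static {
            Runtime.runFinalizersOnExit(true);    // ask the VM to run finalizers on exit
        }

        public void flush() {
            // write any cached / 'still-to-be-written' data to disk
        }

        protected void finalize() throws Throwable {
            flush();                              // automatic close, no manual flush() needed
            super.finalize();
        }
    }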
(Q) I think that the synchronization is indeed desired. Without it, multiple processes/threads couldn't safely read/write the same "object" (represented by a Mapping, if I got it right) simultaneously, which is IMO not a good thing, as that kind of functionality is probably going to be needed.
However, why not consider doing the synchronization at the Mapping level; that is, to allow simultaneous reading and writing to a particular media, but not to the same Object/Mapping? It would mean more overhead, but I think it would be beneficial in the long run. If an application stores a 1 meg chunk of data on disk, that would disable reading from that media for quite some time. But maybe you didn't intend oStore to be used on large amounts of data? PeterSchuller
(A) It certainly is a limitation, and I wouldn't want to stop the use of large amounts of data over that. We could always rename this class SafeStorageIO and the non-synchronised version DefaultStorageIO - what do you think? It seems the best solution in my eyes. IMO the StorageTracker (as a Mapping is really just a pointer) could still implement a system where an object that is being written to media is, if given a read request, read from memory.
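The 'read from memory while writing' idea above could be sketched roughly like this; the class and method names are assumptions.

    // Sketch: while a Mapping's bytes are still being written to media,
    // read requests are answered from the in-memory copy instead of disk.
    public class PendingWrites {
        private final java.util.Map inFlight = new java.util.HashMap();   // Mapping -> byte[]

        public synchronized void writeStarted(Mapping m, byte[] data) {
            inFlight.put(m, data);
        }

        public synchronized void writeFinished(Mapping m) {
            inFlight.remove(m);
        }

        // Returns the in-memory copy if this Mapping is still being written,
        // or null if the caller should read from disk as usual.
        public synchronized byte[] readIfPending(Mapping m) {
            return (byte[]) inFlight.get(m);
        }
    }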
(Q) Is oStore a file system?
(A) Nope - it's a primitive hybrid of a relational and an OO database.
Edited: 19-Jul-98 StefanBorg
Go back to oStore