A few weeks back I did an article on C++Builder where I went completely outside my comfort zone and implemented my very first C++ class.
Since I have gotten a taste for C++Builder my adventures with C/C++ is going to continue over the next weeks and months, but like so many others, the Corona virus has forced me to prioritize things differently with regards to work. So for the time being my focus will be Delphi.
In this Delphi article I want to talk about vector containers, which is a concept from C++ that might be unfamiliar to Delphi developers. When you read the word “vector” you probably think about 3D graphics and matrix mathematics; but vector containers are something else entirely.
Vector containers are, simply put, a generics based class that simplify how you work with lists, arrays, queues and data that can be represented as a “sequence of values”. This might not sound too exciting, but keep reading and you might be surprised.
I must admit that the term “vector” first seemed wildly out of place for such a fundamental and down to earth topic; but if you Google the meaning of the word, vector simply means direction or heading. And a pointer is, for all means and purposes, a heading of sorts.
Vectors in Delphi
While Object Pascal and C++ are practically identical in every way in terms of functionality, there are some aspects where the languages and ecosystems diverge; radically in some places. Delphi might not have the exact same reportoare as the C++ standard library, but it has nonetheless always offered us good alternatives to achieve the same. Delphi and C++Builder share common ground in their support of the VCL and FMX frameworks, but the vector class is not a part of the VCL or FMX, but rather the C++ standard-library (std::vector). As such, Delphi doesn’t really have anything that is a direct match.
Since I have found vector containers so useful in C++ I figured – why not create a vector implementation for Delphi? And that is exactly what I did. So in this article I will walk you through why vector lists are useful, how to use them – and last but not least, the full source-code that you can copy and use in your own projects (completely free and open source, no strings attached).
Note: a BitBucket repository link is provided at the end of this article.
Containers, what’s the Big Deal?
Like mentioned, a vector class is just a container; a container that deals with “sequences of data”, which in most cases means a list or an array. The reason the vector class is called a container, is because it does not involve itself with how the items are allocated or stored, but leaves that to a second class called an allocator.
A helpful way to think about the vector class (meant purely as a synergy or parallel), is to think about TDataset. TDataset abstract away where data comes from and how it’s organized behind the scenes. Just like TDataset, the vector class provides the means of navigating a sequence of data, but it doesn’t involve itself with the details.
Since vector lists are used practically everywhere in the C++ ecosystem, it helps reduce the amount of code you need to write. The same code that works on an array will work for a list. Developers can also write their own allocator instances, storing data on disk or mapping database records directly to C structures. As long as someone has implemented an allocator for that particular scenario, the vector container can represent it. Being able to roll your own opens up for some interesting possibilities.
Out of the box the C++ standard library provides two allocator classes which both represent in-memory lists that can be used with typed data. There is also a third allocator type like I mentioned above (or technically, the base-class that allocators inherit from), which simply means you provide your own. To sum up:
Let’s have a look at each of them and see what the difference is.
Sequenced means that your list of items is allocated as a single, continuous block of memory. One where the items are neatly arranged one after the other. The closest thing in Delphi would be an array of packed records. This allocator has the benefit of being exceptionally fast for reading values, because it can use simple mathematics to calculate the offset for each item.
The downside is that when populating such a vector list, memory must be re-allocated quite often. In my humble implementation I did not take the time to implement a cache mechanism, which means that every time you add, insert or remove an item, the whole memory segment holding the list is re-allocated (something I will add in the next revision).
The C++ specification though, operates with a proper cache, allocating more items than it actually uses, thus keeping reallocation to a minimum. The same system of capacity and use is common for Delphi lists as well.
Another benefit for the sequenced allocator, is that the whole content can be quickly dumped to disk or duplicated in memory. All the items are after all maintained in a single chunk of memory, making sizes and offsets predictable (so you can write the data to a TFileStream in a single operation). Assignment between two vector lists (with the same allocator type) is likewise efficient.
So while ordinary programming tasks might not immediately benefit from sequenced storage, there are a few low-level scenarios where vector containers can do wonders.
As a curiosity I can mention that on platforms like Nintendo DS, the display is actually a file. So graphics are drawn by writing values to a file on the filesystem. C++ developers solved this rare and obscure obstacle quite elegantly, by using a Vector container with a custom allocator that mapped the file into memory; That way the pixels could be read and written much like an array.
Note: I’m not suggesting that Delphi can target Nintendo DS; I’m simply pointing out how flexible and powerful vector containers are when faced with unusual data representations.
Dynamic is more or less identical to how an ordinary Delphi array stores information.
This allocator type has no criteria on how items are stored in memory. As long as the allocator can deliver each item, the vector container could not care less about the details.
This is the exact opposite of the sequenced model, which demands that items are stored in a predictable fashion.
The closest match to a dynamic allocator in Delphi, would be a normal TArray<>.
If vector containers just operated with sequential or dynamic allocators, their usefulness would be insignificant compared to ordinary lists or collections. The power of vector containers is that they allow anyone to implement their own allocators, making it possible to work with data sequences from almost any source or medium without breaking the interface. This is where the time-saving feature comes in.
If something can be represented as a list or sequence of values, an allocator can be written for it; and consequently – the data can be accessed through the common vector interface.
Here are some suggestions to allocators that could be implemented. It is ultimately up to each individual developer to write an allocator class that makes sense for their particular situation:
- An allocator that maps database records to Pascal records
- An allocator that operates with disk based arrays
- An allocator that works with JSON or XML nodes
- An allocator that uses memory mapping
- An allocator that wraps directory listings
- An allocator that wraps text-files
Needless to say, the moment you add Delphi’s powerful RTTI into the equation, things get interesting. A vector list that maps database fields to a record would be fairly simple to create. As far as ORM (object relational mapping) goes, avoiding class instances and mapping directly to Pascal record types would be both faster and more memory efficient. So there are a few unexplored and fascinating opportunities in this material.
A Practical Example
Before we look at the actual vector class implementation, let’s have a look at how you use it. Since the sequential allocator is perfect for maintaining records within a single block of memory (which can be very powerful when dealing with low-level tasks) let’s start with that.
type TTest = record id: integer; name: shortstring; class function Create(id: integer; name: shortstring): TTest; static; end; TMyVectorList = class(TVector<TTest>) end;
In the code above we define a simple record structure with a couple of fields. To make life easier we also define a function to initialize a record, that way we don’t clutter our source with too much boilerplate code. The implementation for the helper function is straight forward:
class function TTest.Create(id: integer; name: shortstring): TTest; begin result.id := id; result.name := name; end;
You will notice that we also defined a class, TMyVectorList, which simply inherits from TVector with our record type. We don’t really need to do this. You could just create an instance of TVector<TTest>, but if you want to expand the vector container with some utility functionality down the line, isolating it like this makes the code easier to read and work with.
Next, switch to the form designer and drop a TButton and a TListbox on the form. While not important for our example, you might want to set the anchors for the listbox to left, top, right and bottom so it resizes with the form.
With our form design done, double-click on the button and fill out the following code:
procedure TForm1.Button1Click(Sender: TObject); var x: integer; lList: TMyVectorList; begin lList := TMyVectorList.Create(vaSequence); try // populate for x := 1 to 10 do begin lList.add( TTest.Create(x, 'name #' + IntToStr(x-1) ) ); end; // do some inserts lList.insert(4, TTest.Create(4, 'first') ); lList.insert(9, TTest.Create(9, 'second') ); // Read back for x := 0 to lList.count -1 do begin listbox1.items.add( lList.Items[x].name ); end; finally lList.Clear; lList.free; end; end;
Hit F9 (compile and run) and click the populate button, the result should be:
What are the Benefits?
If you are wondering why the above is any different from using a standard TArray or TList to achieve the same, you have to remember how the allocator works. The difference here is not what it does, but rather how it does it. You must factor in that the power of vector lists is not it’s ability to house a list of records, but rather that it provides a unified interface for accessing values sequentially – regardless of origin or format.
TVector has no knowledge of how or where the list data comes from. It only knows the generic type (TTest in our example) and the allocator used to manage the list. Beyond that, TVector is nothing more than a unified access point.
In the example above we used the sequential allocator, so the list items are stored within a single block of memory.
To make sure that the memory integrity is coherent, let’s double check! Let’s use a pointer and access the memory directly old-school style!
Accessing the Raw Data
In our next example we are going to access the raw memory that our vector contains (or more accurately, that the allocator provides and the vector expose). My humble vector implementation keeps the container and allocator conveniently decoupled, bound only by an interface reference. But that interface exposed methods that allows us to tap into the data (when applicable of course).
The methods in question are:
- LockMemory(var data)
I should underline that, when you implement your own vector allocator classes, you should throw an exception if memory access is not possible or allowed. I was very reluctant to include LockMemory() and UnLockMemory() at all, since the entire point of a container is to abstract away knowledge of the content. But truth be told, there is something about being practical about all of this too.
So switch back to the form designer and let’s add a second button like this:
Next, we define a pointer variation of our record type:
PTest = ^TTest; TTest = record id: integer; name: shortstring; class function Create(id: integer; name: shortstring): TTest; static; end;
With a pointer type for our record defined, we can access records directly from a memory address. Now double-click on the new button and fill out the following code:
procedure TForm1.Button2Click(Sender: TObject); var x: integer; lList: TMyVectorList; lRaw: PTest; begin lList := TMyVectorList.Create(vaSequence); try // populate for x := 1 to 10 do begin lList.add( TTest.Create(x, 'name #' + IntToStr(x-1) ) ); end; // Get a pointer to the managed memory lRaw := nil; llist.Memory.LockMemory(lRaw); try // manually read out the name field from each for x := 0 to lList.Count-1 do begin listbox1.items.add( lRaw^.Name ); inc(lRaw); end; finally lList.Memory.UnLockMemory; end; finally lList.Clear; lList.free; end; end;
Hit F9 (compile and run), click the new button, and you should see the following result:
As you can see, the sequenced allocator maintains data integrity and delivers predictable results.
The original C++ vector standard class supports enumerators out of the box, but the way that enumerators are defined in the C++ standard library is not compatible with Delphi. So instead of implementing a enumerator pattern incompatible with Delphi, I naturally went for the traditional Delphi enumerator system instead. It would be cool to be 100% compatible with the C++ vector lists, but there is a fine line between compatible and usable. I’m always on the practical side.
So the TVector class inherits from TEnumerable so you can access the list in handy, modern fashion like:
var el: TTest; for el in lVectorList do begin // do something with el end;
With the vector list inheriting from TEnumerable, you might wonder why I would trade away the benefits of TPersistence in favour of enumerators? The vector list implements a standard Assign() method, so you can assign the content between vector containers. It will also check if the allocator models are sequential – and perform a fast resize and move of the entire memory block if possible.
I also made sure that the allocator base-class inherits from TInterfacedPersistent, so there is plenty of room for improvements on this front later. I will leave that part up to you, the reader.
The Vector Class Unit
I think we have covered more than enough theory and use-cases by now, so it’s time to check out the code. Literally.
Since the code would cover some 20 A4 pages, I took the liberty of setting up a repository on BitBucket for it. The code is released under MPL v2 (Mozilla Public License) so you can safely use it in your own projects.
You can download or fork the repository here:
Note: Version 1.0.1 is due out this weekend (monday at the latest), and the new features are breathtaking! Buffer support (untyped memory system), typed Views<> (accessing untyped memory as a typed array) and vector containers. Combined these make mincemeat out of some very complex tasks! The buffer system unifies for both memory and file storage, so that vector content can be used directly from disk (much like a database). You can also implement your own buffers (why not make a version that reads from your Amazon or Azure storage?).
Finally Delphi developers can enjoy the same ease and power as C++ developers have done for years. The new version even surpasses the C++ standard library by introducing seamless memory and file storage, mapped memory and even support for TBit in typed views (so you can traverse a buffer as an array of bits!).
But enjoy the tutorial version referenced above. It was made to help you get started. When you are ready, grab the code from the “latest” sub-folder and play with the really powerful stuff.