Everything is now much better. I have built a unified memory merger that should do the trick for all the things i need. I have also created a incidental memory reader that lets a user read variable amount of memory while only locking an object once.
Right now I'm working on how each record stores data. Its complicated because data can both be very small (one byte) or very big (terabytes). The right solution is to have several ways of storing the data. One trick I'm using is the have a pointer pointing in to the data, but if the is as small or smaller then the pointer i can have a union with the value and use the memory used for the pointer to store the value. This saves me space, a level of indirection and an allocation. It may sound like a small optimization but my feeling is that the majority of values will be small enough to fit in a pointer, so optimizing for it makes sense.
When dealing with large sets of data that requires allocations it makes sense to first specify the format and layout of the memory and then fill out the data. This is however often not very convenient. Often you already have data and want to make a change to it. Let me give you an example of an issue. Imagine this data structure:
To modify this we then call:
Give the size of "array" removing members from the struct is very costly since you will need to re arrange the memory. This means that when the user asks to remove the member "distance" we don't want to do all that work. Especially we don't want to do that work since the call after removes "count" and the data has to be re arranged again. In fancy we don't need to rearrange the data at all, until we commit the data top be stored. At that point we most likely have to copy the data anyway so the re-arrange can become almost "free". This however doesn't always work. Consider these three examples:
&/* example A */
record_member_add(record, "new_thing", "array", FLOAT);
&/* example B */
record_member_add(record, "new_thing", "array", FLOAT);
record_member_set(record, "new_thing", 3.14);
&/* example C */
record_member_add(record, "new_thing", "array", FLOAT);
record_member_set(record, "new_thing", 3.14);
In A we add a new member to array, but that doesn't mean we have to re arrange the memory until committing. In example B, we create a new member and then store the data in it. This means that there needs to be space available for if, so record_member_set will triggers a re-arrange. In example C we add we first remove a member, then add a new value. Given that the old value took up more space then the old value we are able to use that space to store the value given in record_member_set, but we can avoid a re arrange until commit. All this obviously makes everything very complex... But at least I'm making progress. I wonder if other databases do these kinds of things... Ive heard horror stories but I don't really know how much of it is true. I have a feeling they dont pack the data as densely as i do.
I have tried for about a week to design and implement my records, and I have come to the conclusion that it cant be done right with the code i have, so I'm about to throw away almost all my work and start over. It sucks.
Here is the problem: I want the records to work like this; you have an existing record, and you retrieve it from storage, you make a bunch of changes to the record, and then you apply the record to your storage. Easy right? No not at all if the record is big. Let me give you an example, lets assume the record is 10gigs large and the change you want to make is one byte. First of all just loading in the entire record in to ram is not feasible or at least not effective, Then writing all 10 gigs just for the change of one byte is clearly not effective either. Now lets assume that the user gets the record makes a change and then cancels the operation by not saving the change, then a lot of work has been done for no reason. OK So the solution would be to just store the changes, and then apply them to the record, when the user actually applies them, right? Yes, but that is not enough, What if the user adds 10gigs to a record? Storing all those changes in ram is not good enough. You will need to store them in the storage system that can put things on disc when need be. OK so THEN we have a good enough system right? No, once the have added the 10 gigs of data to the file storage and you want to use that data to create the new version of the record, you don't want to load them from disk in to ram so that you can put them back to disk. You want to use the fact that they are already on disk and make the new record just point to the data that has already been committed to disk.
This breaks my storage system. So i have to rewrite it. It sucks but that's the way it is. I'm working on specyfying what the new system will look like.
I'm working on the data format and 2 major issues have been occupying my mind. The first deals with communicating changes to the internal data format. If you have a record you can request the current version of the record. For most applications this is all you really need, but in some cases you want to subscribe to a record and get notifications if it changes. Now the question comes, how do you describe changes? in the data store changes are described as inserts. Each insert contains a start point and end point, a size of the insert and an array of bytes to be inserted. So far so good. With this information you can go from one document version to another. Easy right? But you get in to trouble when we get to arrays. consider this array:
The data it self is continuous so if we want to add 5 more values we can do so in one command, the problem is that we also must change the 'a' value that tells us how long the array is too, otherwise the record is in a invalid state. So this tells us we need compound changes where we need to do multiple inserts in one go. That's all doable, a bit more complex, but doable.
Then we get to more complex data structures like arrays of arrays:
Changing 'a' now is much harder since we must also change 'b' number of arrays to make the data consistent. Now the compound change needs to be a fair bit more complex. All of this can be solved by just adding more inserts and i guess that's fine, but it irks me a bit especially since inserts might no longer be the most optimal way of describing the change.
Another big issue is that an owner of a record can accept suggested changes from other users. For this to work there needs to be some form of mechanism where an application can receive the changes and then accept or deny them. The obvious way to do this would be to present the owner with two records, the old one and one with the suggested changes. Then the owner can use what ever criteria they want to establish if the change is acceptable. Its simple, but if the record is massive, say terabytes of data, creating an entirely new copy of the data and then have an application search all that data only to find a single byte has changed somewhere is hardly optimal. It would be much better if the system could describe what has changed, but then the description API becomes much more complex.
I can with lots of work parse out what the changes are from the inserts, but I still have a feeling there might be a better way. I worry a lot about; on one hand making the API very complex, and on the other hand making it slow.
The truth is I don't like my options, and that makes me think maybe there are better options out there.
Third day of thinking and I now think I have a plan. Instead of just having records be locked by a user preventing any other user from accessing them I plan on having 4 different ways of locking a record.
Read only lock.
A read only lock is trivial to implement since the underlying memory manager can has built in regression. When someone locks a version for reading, we can simply keep that regression around for as long as wee need to. A read therefor cant in anyway hinder other tasks from reading and writing to the record.
A Ownership lock gives you complete control over a record, but is dependent on access to the private key of the record. Only one thread can take ownership of the record at any one time. An ownership lock also lets the thread set the rights for the object.
A modify lock is a lock that gives you the same capabilities as the ownership lock but doesn't require the private key. The nice thing about a Modify lock is that multiple threads can lock at the same time. The system will then arbitrate the changes made by the different locks. This means that the changes can be rejected. The modifications that can be made to the data structure can be controlled by options set by a thread with an ownership lock. The owner of the record should even be able to inspect and then reject or approve each modification using some sort of callback mechanism.
A clone lock is a way to create a Ownership lock by simply creating a new record by cloning an old one. All data that is stored by check sum only is clone locked.
The nice thing is that once you have locked a record the API for reading and writing is the same no matter how you locked it. The only real change is that some actions may be rejected. The arbitration system is going to be complex to implement but i think having the ability to have multiple users collaborate in realtime with data is important enough to be worth the extra work. (Plus since I have written Protocol that can do it in the past, i cant really defend building a new protocol with less features...) Its also worth mentioning that most of thees rules only matter if you get multiple treads working on the same record, and that will probably be a rare thing.
I have spent the last two days without writing a single line of code. I'm focused all of my time trying to figure out how the API for reading and modifying records should work. I making progress but I'm still not done.
All the work i have done with the regression system helps a fair bit. If one thread is reading data it is possible to keep around that version and keep it consistent even if an other thread makes changes to the data at the same time.
In a normal concurrent system you want a thread to be able to take owner ship of a resource, make changes to it, and then release it. In Unravel this wont be enough. Owner ship is not just an issue of multiple threads fighting over data, its an issue of multiple computers on the Internet all being able to suggest changes and then having an arbitrator who controls what changes are allowed. I have written this kind of thing before with VERSE but the challenge is to make it very transparent. Verse was written for Low-Level 3D graphics people, Unravel has a much wider audience.
Something I am working hard on is trying to avoid multiple API paths. I want the API to force you to do the right thing but at the same time not over burden any application development. If features are optional then they wont be supported. If there is a slow path that is easy to implement and a fast that is harder everything will be slow. The risk is that if you add a bunch of requirements, then people may opt to write no code at all.
Progress! The memory segmentation is pretty much done. I do want to optimize the regressive data storage, and i have a feeling that at some point in the future I will want to optimize how memory is stored on disk. Im investigating the optimal way to read and write data from SSDs.
Ive started sketching out the new record API, but i will need to figure out how the data will be stored before i can start implementing stuff. I expect this part will be much easier then the memory segmentation stuff.
Today, i will just work on cleaning up the mess i have made testing my memry segmentation code...
A bit of time warp here. I ve been gone for a bit of New York and San Francisco, but now im back focusing on Unravel. In fact I now have some time pressure. I have several possible uppcomming projects where Unravels data store could be used, and I much rather use the projects as a test bed for unravel then to hack together somethjing throwaway. I dont know when or if it will happen but I want unravel up and running sooner rather then later.
First day of coding whas basicly just spent looking at the code trying to understand how it works, thats done and whats missing. Its some really complicated stuff. This morning i got in to testing and weorking on the regressive data store. I worked on it for a bit and realized that my lowest level memory manager nbeeds to support realloc. The main reason is not to avid moving data (something realloc can do using remaping the memory layout in the MMU) but rather to be able to change the sice of a data chunk while keeping the id. This means i dont have to update all the places that reference to the memory, that will simplify the code and potentialy be a big speed up. Ill get to work on that tomorrow.
Viritualizing memory has this interesting psycological efect that any access to a new memory block can potentialy be very slow. It really makes you very paranoid. The probnlem is that the preformance entierly depends on the access pattern of the user, and thats something you cant really predict.
No change. I still think its brilliant.
I have totally changed my mind. I really don't mind endianess. So first of all i will need to do a single memory copy no matter what. The reason is that i don't want an application using unravel have to care about locks, and therefor everyone gets their own copy of the memory they need. No app can hog memory and prevent other apps from accessing it.
Now to explain why I no longer mind endianess I need to explain a bit about the kind of API i want to build. Any time you have any kind of data storage you want to access you will need to write transformation code that wrangle the data on to the structures you need. To fix this i had this idea that if the data was stored in memory the same way you would store the memory in a struct you could just get a pointer to the memory, cast it to a struct and go. Since anyone can store data in Unravel any way they want you would need to have some kind of test to make sure its the right data that you are expecting. This will be needed for all kinds of things, so its not a bad thing, the problem is that it makes everything very fragile. If the data isn't stored exactly how your code expects it to be laid out, you code will need to use some slow backup path. I really dislike this because it means you have to write a fast and a slow path. Most people will just implement the slow path that only works, and then only if they really want to optimize they will pick the fast path. I know that's how people do it but i hate it.
So f you need to copy and transform data every time you read or write it, why not let the user define what it should be transformed on to? Instead of getting memory and hoping its in the right layout, why cant the application just tell the API what format they want the data in and have the transform do it for you? If it always requires a transform anyway, user transform becomes free. The code could look like this:
void read_data(UnravelRecord *record, uint count)
meomory = malloc(sizeof(MyStruct) * count);
transform = unravel_transform_create_from_code(record, "char *name;"
count = unravel_read(record, transform, memory, count);
for(i = 0; i < count; i++)
printf("name %s/n", memory[i].name);
printf("count = %u/n", memory[i].count);
printf("value = %f/n", memory[i].value);
Got nothing done. Today big / little endianess, hit me like a ton of bricks. It ruins everything. Usually i pack all networktrafic in all my protocols big endian, and that is fine but here im running in to a wall. Normaly a protocoll is something that just transfers meaning that it doesnt consern it self with how the data is accessed once it gets where it is going. Unravel is different. Unravel is designed to be usable as extendable ram. Accessing local and network data is done with one API and to truly make the network transparent, the local data access speed must be comperable to an application storing it in ram. If accessing data is slower then building your own structures in ram, users will have to start cashing data in their own data structures, and thats when the networking becomes some thing you need to work on rather then a free benefit of the memory manager. This fairly extreme preformance requirement makes everything very dificult. My first thought was to try to store all local memory in machine order. But that requires me to knwo the layout of the memory when sending it in order to be able to re-order it. Storing data in alocal way also messes up how checksums are computed (they may not be computed right away, again for preformance reasons). I also realized that you probably want to store data without padding, so some sort of local transform would be required. I now have come to the conclusion that I can probably bild very fast lookup tables that can be used when copying data to solve this. Im fairly sure it will be slightly faster on big endian then little endian machines. I will investigate further.
I have solved a lot of things. I now know how to manage my different data types and how to refference count them. A big issue was that i couldent store reference counts, together with the data being referenced since that would mean that updating the referenc count would force the underlying memory manager to swap in ther data form disc. I decided to keep my reference counter in the index instead, but that meant that it wasnt as fine grain as maybe would have liked it. I think I have strategy for solving that.
Im running in to trouble. I want Regresion to be a core part of Unravel. If you make a distributed database you want others to be able to change data, but you dont want them to be able to destry data so ypou will need versioning. If you have a large data set and you make many small changes to the data set it makes sence to not store a copy of each state but rather a change log. This complicates the data structure a bit. As long as the history stays intact its fgine, but what happens when you start refferencing time stamps? This can create branching. This creates a lot of complexity especialy when you want to prune the data set. I will need to take some time to think about how to deal with this...
Time is getting very short as I'm aproaching GDC. I really want to complete my memory management before I get interupted by other stuff. As soon as I'm done with the memory manager I'm going to get in to the data base interface. Even though this is going to become a network protocoll I'm probably going to spend months before opening a socket. Im trying to read up on how people like to access databases (and how people ruin any preformance in them by using json and http...).
I finally completed the static insert code yesterday (all 16 paths). Ive been spending the day fuzz testing and cleaning up the code. I feel like i finally have control over the code. I have loads of ideas on how to improve and optimize but at least i have something working. The big missing bit right now is the regressive memory manager, but i have an outline of that too. Fuzz testing has been fun.
I bought a Google chrome cast audio for my mother this Christmas. Its a wireless device that connects your computer to your stereo. Its essentially a wireless cable. To install it you first had to download the chrome browser, the you had to go threw a bunch of steps to connect link up. When I was done i had connected my moms stereo to... a browser. Yes the only way to play anything was if it was in the browser. Looking around to how to play my mothers iTunes library of music Google was happy to inform me they had set up an entire store for apps that she could use you buy all her music again so that she could play it wirelessly. After looking around a bit i found a blog where someone told me it was possible if i only patched the OSX kernel, and then ran someones node.js script. To do this I needed to install node.js, and in order to do so i had to download xcode, and in order to run the script i had to be super user, because administrator was not enough. In the end it turned out that OSX wouldn't let me install even as a superuser.
All this made me think of all the old businesses that Internet companies used to mock. Brick and mortar companies and rights holders who "just didn't get it". That's where the big Internet companies are right now. Sure they will deliver, but only if you exclusively use their services, their software and hand over all your data. It all works, but only if you do exactly what they intend for you to do. How fast they became more interested in their own interests then serving customers.
Do you want to know how to kill a big company? Find their profit center, and then base your competitive advantage on NOT going after that money. They may understand that their revenue will go away, but psychologically they will never be able to let go of it.
What is beautiful code? Most computer scientists seams to think its the fewest lines of code. The Linux Kernel mailing list FAQ defines bloat as the size of the executable. If that is the case I'm writing ugly code, because i consider beautiful code the one that uses the least number of cycles to be the best. Right now I'm working on the memory management and there are many different considerations as to how to store memory. You want to eliminate as many look ups and memory moves as possible and at the same time handle very large sets of data. This means that i have ended up with 4 different memory representations. On top of this all 4 representations can be regressive and that gives us 8 representations. They all need to be compatible and invisible to the APIs using the memory managers. When converting from one to another I cant simply first convert them to a common format since the only format that can store the full range of memory is also the slowest one. The end of it is that i have to implement an awful lot of paths and find the optimal way of doing everything for each one. Happily, I have at least solidified each of the formats.
So how much of the internet do i plan to replace? My plan is to be able to go a week without using the web and not notice it. Think of all web sites and sevices that dont deliver yopu something IRL. I guess thats a fair bit.
Its a curious thing to be able to overrule the world. Technology and science does that. It makes something possible or impossible no matter any law or opinion. Its a powerful thing. Why argue right and wrong, when you can just make it right?
I purposefully not writing a post today just to prove that I'm not under any obligation to write every day.
Here is what I'm thinking: Lets assume we don't want to divide data in to chunks no larger then 64 megs (26 bits) when we load them in to RAM. If each memory chunk has an address 64 bits. That means that if we make an array of addresses we can store 8 million (23 bits) addresses in to one chunk. 8 million times 64 megs adds up to 512 terabytes of memory or 49 bits of address space, 17 bits shy of a full 64 bit space. To fill a 64 bit space we need arrays of addresses to arrays of addresses, especially since Chunks will be able to be significantly smaller as you will see below.
Lets assume we have an array consisting of 3 chunks we can call A, B and C. Then we want to insert data X somewhere in the middle of this array, at a point that happens to be in B. Lets start off by saying that we do not want to reconstruct all data in to entirely new chunks. We can do this in a few different ways. One is to simply record where the insert happens and then store X. No data needs to be moved for this action. One win is that we now store regression since we both have the data before and after the insert and can construct the data at any point. This is something you may want but more on this later. The problem is that if we pile on enough changes retrieving the current (and probably most desirable) state becomes slower and slower as the genealogy takes longer and longer to reconstruct.
A different solution would be to split chunk B in to 3 parts, B1, the part, if any, that comes before the insert, B2 the bit that gets over written, if any, and B3 the bits after the insert, if any. We can now construct a list consisting of A, B1, X, B3, and C, to describe the new state. This is much faster to reconstruct. If we want to store the before state all we need to do is create a list containing A, B1, B2, B3, and C. The downside to this is obviously that over time we may fracture the data significantly. If we don't want to keep around the regression we can optimize this by merging chunks. Merging of chunks can be done in parallel with the use of the arrays. I think this is the best approach.
One issue is that all data in Unravel is referenced by checksum, and computing a checksum can be very slow if the data set is very large. If a single changed byte forces the re-computation of a new checksum on the entire data set this could very easily become an issue. This is not good enough. So I have some ideas. For real time data it should be possible to send out the changes rather then the entire data sets. In this case the checksum doesn't have to be updated that often.
Lets say Alice has significant data set A and has shared that with Bob, and she makes a minor change to it, turning it in to B, can send just the change and bob can reconstruct B. Lets assume that Carol also has data set A, but is off line when Alice makes the change. Since Alice has very little memory she has gotten rid of A and the change and now only possesses B. Carol now wants to update her A in to B. If the data set is large enough she will ask Alice to give her a list of the chunks it is made up of, so that Cecil can download it peace meal. When this list arrives Carol will notice that she already has most of the chunks, so she can reduce the downloads needed significantly.
I think this is how I will do it. Right now I'm trying to figure out what the best rules for chunk sizes ought to be.
I can now read data arrays that are around 16 exabytes large. (obviously you can store more things if you divide them in to more then one array). I can make very large arrays too, but only in the range of thousands of gigabytes, since the pointers eventually become so large that you cant keep them I'm memory realistically. I will remedy that tomorrow when i will work on being able to overtime build up large data sets rather then define them in one go. This is exhausting, and it makes my brain hurt, but I'm making progress. I know its going to get a lot worse once i start with regression. Iv been debating weather to allow insert and delete or only over write, and i have come to the conclusion that insert and delete will probably be necessary and i will get some performance gains by supporting it in the lowest level. Much like an MMU can fiddle with address space when you call a realloc, I can do the same with the chunks that make up large data sets.
You might think that if you would redesign the Internet you would make it so that everyone has a identity backed up with a cryptographic key pair. You would be wrong. You wouldn't have one key you would have thousands. First of all you are multiple people, one for your family, at least one for your job, another one for what ever kink you are in to and so on and on. Often its a good idea to keep your identities apart. If you live in some countries your life may depend on it. So yes a few identities, but thousands?
Not just people have identities so do things and software. You may not want be able to know what software produced what data and and what data it has access to. Data it self can also have identity. In Unravel all data is identified with hashes. This means that if you have a file and i have file our computers can talk to each other about that file. We can for instance reference to an article and know that we are both referencing the same article. No need for a signature for any of this, just a simple hash.
Sometimes we want to reference to the Guardians front page and we mean a very specific front page at a specific moment in time. Other times however, we want to reference the guardians front page as an evolving publication that keeps changing again an again. To do this we need a cryptographic key that keeps signing the hash of the current front page. For anything that we want to be able to change we need an identity. Most things change, so most things need an identity, so lots if identities.
Right now I'm working on this universal link system where you can either reference to a publication or a hash. This universal pointer will be 256 bytes long and contain a 64 bit length value. I also have a trick where if the data itself is smaller then the check sum, then the data can be stored directly in the check sum. This will create a natural 248 byte limit where data can be embedded in to links. I'm sure someone will find some great significance in that number that speaks to the human condition. I expect it to optimize network traffic. Speaking of optimizations, all these links are implemented as single memory allocations. This means that I cant use structs, but I have to cast in to offsets in to memory. It does give me that warm low level feel only cache coherency can give.
That was fast. Apparrently the internet doesnt like to be changed. My Internew went down today. Oh, well. Visual studio still likes me. So work continiues.
I'm starting a new project today. I'm going to redesign the Internet. I don't like the Internet as it is, so I'm redesigning it from thew ground up. I have been planing to do so for a while, but now I'm going to put down some real time in to it. I'm starting pretty low-level. I'm building a database to store all necessary data. Why transfer data if you cant store it, right?
It needs to be secure, fast, thread safe and easy to use. As I'm a C programmer I'm not really sure what kind of fluffy ease of use today's high level programmers expect, but I want direct memory access where you can get a pointer straight in to a block of memory (did I mention that I want it to be low level fast?) but I also want to be able to handle data sets way larger then can fit in to memory. Whats the best way to get a pointer in to a 16terrabyte file?
Right now I'm trying to wrap my head around what limitations I'm required to live with. I don't want to be the guy who thought no one would ever need more then 640k. I considering having a limit of 8 exabyte files. I hope that's OK. This is one of the reasons why this is an interesting project. Its easy to build something when you know what is going to be used for. I'm buildings something that that is meant create things we cant yet imagine.