What is event sourcing?
Storing all the changes (events) to the system, rather than just its current state.
Why haven't I heard of event stores before?
You have. Almost all transactional RDBMSs use a transaction log to record every change applied to the database. In a pinch, the current state of the database can be recreated from this transaction log. This is a kind of event store. Event sourcing just means following this idea to its conclusion and using such a log as the primary source of data.
What are some advantages of event sourcing?
- Ability to put the system in any prior state. Useful for debugging. (E.g., what did the system look like last week?)
- Having a true history of the system. Gives further benefits such as audit and traceability. In some fields this is required by law.
- We mitigate the negative effects of not being able to predict future needs, by storing all events and being able to create arbitrary read-side projections as needed. This allows for more nimble responses to new requirements.
- The kinds of operations made on an event store are very limited, making the persistence very predictable and thus easing testing.
- Event stores are conceptually simpler than full RDBMS solutions, and it's easy to scale up from an in-memory list of events to a full-featured event store.
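As a rough illustration of the "in-memory list of events" starting point and of rebuilding a prior state, here is a minimal sketch. The bank-account events (`Deposited`, `Withdrawn`), the `InMemoryEventStore` class, and the `balance` function are hypothetical names chosen for this example, not part of any particular event store product.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical event types for a bank account (illustration only).
@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


class InMemoryEventStore:
    """Append-only list of events; the only operations are append and read."""

    def __init__(self) -> None:
        self._events: List[object] = []

    def append(self, event: object) -> None:
        self._events.append(event)

    def read(self, up_to: Optional[int] = None) -> List[object]:
        # Return the first `up_to` events, or all events when up_to is None.
        return list(self._events[:up_to])


def balance(events: List[object]) -> int:
    """Fold the events into the current state (here, just a balance)."""
    total = 0
    for event in events:
        if isinstance(event, Deposited):
            total += event.amount
        elif isinstance(event, Withdrawn):
            total -= event.amount
    return total


store = InMemoryEventStore()
store.append(Deposited(100))
store.append(Withdrawn(30))
store.append(Deposited(5))

current = balance(store.read())          # state after all events
earlier = balance(store.read(up_to=2))   # state as of the second event
```

Replaying a prefix of the list is what "putting the system in a prior state" amounts to in this sketch; a real event store would typically select the prefix by timestamp or version rather than by list index.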
Is event sourcing a requirement to do CQRS?
No. You can save your aggregates in any form you like. However, event sourcing works well with CQRS, and brings a number of additional benefits.
What if an event in the event queue turns out to be wrong?
In an event queue, new events are added to the end of the queue. Events are never removed or changed. (Just as in an accountant's ledger, incidentally.) To correct an actual mistake, you add compensating actions: events that cancel out the effect of earlier events.
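A compensating action might look like the following sketch. The event names `Deposited` and `DepositReversed` are assumptions made for illustration; the point is only that the mistaken event stays in the log and a later event cancels its effect.

```python
from dataclasses import dataclass


# Hypothetical event types (illustration only).
@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class DepositReversed:
    amount: int
    reason: str


def balance(events) -> int:
    """Compute the balance by applying every event in order."""
    total = 0
    for event in events:
        if isinstance(event, Deposited):
            total += event.amount
        elif isinstance(event, DepositReversed):
            total -= event.amount
    return total


log = [Deposited(100), Deposited(500)]  # the 500 deposit was entered by mistake

# Correct the mistake by appending a compensating event; nothing is edited or removed.
log.append(DepositReversed(500, reason="entered by mistake"))

corrected = balance(log)
```

The log now holds three events, and the history still shows both the mistake and its correction, much like a reversing entry in a ledger.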
Won't the use of event sourcing make my system slow?
Rebuilding the current state by applying events takes more time than reading stored state directly. But processors are really fast; applying an event takes on the order of microseconds. For most domains, performance isn't a problem.
Furthermore, the tight aggregate boundaries that come hand in hand with event sourcing should lead to systems that will scale well horizontally.
What is snapshotting?
An optimization where a snapshot of the aggregate's state is also saved (conceptually) in the event queue every so often, so that event application can start from the snapshot instead of from scratch. This can speed things up. Snapshots can always be discarded or re-created as needed, since they represent computed information from the event stream.
Typically, a background process, separate from the regular task of persisting events, takes care of creating snapshots.
Snapshotting has a number of drawbacks related to re-introducing current state in the database. Rather than assume you will need it, start without snapshotting, and add it only after profiling shows you that it will help.
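The idea can be sketched as follows, assuming events are plain integer deltas and the aggregate state is a single number. `Snapshot`, `load`, and `take_snapshot` are hypothetical names for this example, and the snapshot is kept beside the events rather than inside the event queue to keep the sketch short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    version: int  # number of events already folded into `state`
    state: int


def apply(state: int, event: int) -> int:
    # In this sketch an event is just an integer delta.
    return state + event


def load(events: List[int], snapshot: Optional[Snapshot] = None) -> int:
    """Rebuild state, starting from the snapshot when one is available."""
    if snapshot is not None:
        state, start = snapshot.state, snapshot.version
    else:
        state, start = 0, 0
    for event in events[start:]:
        state = apply(state, event)
    return state


def take_snapshot(events: List[int]) -> Snapshot:
    # A snapshot is purely derived data, so it can be rebuilt at any time.
    return Snapshot(version=len(events), state=load(events))


events = [10, -3, 7, 20]
snap = take_snapshot(events[:3])  # snapshot covering the first three events
events.append(-4)

from_scratch = load(events)         # applies all five events
from_snapshot = load(events, snap)  # applies only the two events after the snapshot
```

Both paths produce the same state, which is why a snapshot can be discarded without losing information.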
How do I version/upgrade my events?
You leave them as-is in the event store, because it is conceptually an append-only list. However, both write side and read side can "upgrade" incoming events in their handlers. An event can always be upgraded to a newer version; if it cannot, it was probably not a newer version after all, but a completely different event type.
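One way to "upgrade" events in a handler is a chain of small conversion functions, sketched below. The event shape (a dict with `type` and `version` keys), the `CustomerRegistered` event, and the name-splitting rule are all assumptions made for this example.

```python
from typing import Any, Callable, Dict

Event = Dict[str, Any]


def upcast_v1_to_v2(event: Event) -> Event:
    # Hypothetical change: version 1 stored "name", version 2 splits it in two.
    first, _, last = event["name"].partition(" ")
    return {
        "type": event["type"],
        "version": 2,
        "first_name": first,
        "last_name": last,
    }


# Maps a stored version to the function that upgrades it by one step.
UPCASTERS: Dict[int, Callable[[Event], Event]] = {1: upcast_v1_to_v2}


def upgrade(event: Event) -> Event:
    """Apply upcasters until the event is at the newest known version."""
    while event["version"] in UPCASTERS:
        event = UPCASTERS[event["version"]](event)
    return event


stored = {"type": "CustomerRegistered", "version": 1, "name": "Ada Lovelace"}
upgraded = upgrade(stored)
```

The stored event is never modified; the upgrade happens on the way into the handler, so the event store remains append-only.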
How do I handle a growing/large event store over time?
Events are usually quite small, and you can easily store, index, and search millions of them on a low-end relational database.
That said, it's always good to plan ahead, and pick a serialization format that serves you well in terms of size. JSON tends to be smaller than the corresponding XML, for example.
If you feel the need to shrink your events further, a compact binary serialization format is an option; Google's Protocol Buffers are one example. Applying a general-purpose compression algorithm to the serialized events is another.
For the cases where you literally run out of disk space: disks are cheap nowadays. Consider archiving historical events in some permanent storage. The events carry important business value; do not throw them away.
If the event store outgrows a single machine, then it is easy to shard first by aggregate type, and with a little content-based routing even at the level of aggregates themselves.
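A minimal sketch of such routing, assuming aggregates are identified by a type name and a string id: the function `shard_for` and the shard naming scheme are hypothetical, and hashing the id is just one possible form of content-based routing.

```python
import hashlib


def shard_for(aggregate_type: str, aggregate_id: str, shards_per_type: int) -> str:
    """Pick a shard: first by aggregate type, then by a stable hash of the id."""
    # Use a stable hash (not Python's built-in hash(), which varies per process
    # for strings) so that an aggregate always routes to the same shard.
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % shards_per_type
    return f"{aggregate_type}-{index}"


shard_a = shard_for("Order", "order-42", shards_per_type=4)
shard_b = shard_for("Order", "order-42", shards_per_type=4)
shard_c = shard_for("Customer", "customer-7", shards_per_type=4)
```

Because every event of one aggregate lands on the same shard, loading an aggregate never has to cross shard boundaries. Changing the number of shards would re-route existing aggregates, so a production setup would need a migration plan or a scheme such as consistent hashing.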
Could I persist commands, too?
It's often useful to log your commands, because they contain important information about the requests made on the domain model.
But commands are not events, and they don't belong in the event store. Simply consider logging of the commands as an additional aspect to be wrapped around your command handlers.
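Wrapping logging around a command handler could be done with a decorator, as in this sketch. The `log_commands` decorator, the in-memory `command_log` list, and the `handle_rename` handler are hypothetical; a real system would write to a proper log sink instead of a list.

```python
import functools
from typing import Any, Callable, Dict, List

Command = Dict[str, Any]
Event = Dict[str, Any]

# Stand-in for a real log sink; kept separate from the event store.
command_log: List[str] = []


def log_commands(handler: Callable[[Command], List[Event]]) -> Callable[[Command], List[Event]]:
    """Record every incoming command before passing it to the handler."""

    @functools.wraps(handler)
    def wrapper(command: Command) -> List[Event]:
        command_log.append(repr(command))
        return handler(command)

    return wrapper


@log_commands
def handle_rename(command: Command) -> List[Event]:
    # The handler itself knows nothing about logging; it only emits events.
    return [{"type": "CustomerRenamed", "new_name": command["new_name"]}]


events = handle_rename({"type": "RenameCustomer", "new_name": "Ada"})
```

The command ends up in the log and the resulting event would go to the event store, which keeps the two concerns apart.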