[mysql] Is there a MySQL option/feature to track history of changes to records?

I've been asked if I can keep track of the changes to the records in a MySQL database. So when a field has been changed, the old vs new is available and the date this took place. Is there a feature or common technique to do this?

If so, I was thinking of doing something like this. Create a table called changes. It would contain the same fields as the master table but prefixed with old and new, but only for those fields which were actually changed and a TIMESTAMP for it. It would be indexed with an ID. This way, a SELECT report could be run to show the history of each record. Is this a good method? Thanks!

This question is related to mysql database

The answer is


Here is how we solved it

a Users table looked like this

Users
-------------------------------------------------
id | name | address | phone | email | created_on | updated_on

And the business requirement changed and we were in a need to check all previous addresses and phone numbers a user ever had. new schema looks like this

Users (the data that won't change over time)
-------------
id | name

UserData (the data that can change over time and needs to be tracked)
-------------------------------------------------
id | id_user | revision | city | address | phone | email | created_on
 1 |   1     |    0     | NY   | lake st | 9809  | @long | 2015-10-24 10:24:20
 2 |   1     |    2     | Tokyo| lake st | 9809  | @long | 2015-10-24 10:24:20
 3 |   1     |    3     | Sdny | lake st | 9809  | @long | 2015-10-24 10:24:20
 4 |   2     |    0     | Ankr | lake st | 9809  | @long | 2015-10-24 10:24:20
 5 |   2     |    1     | Lond | lake st | 9809  | @long | 2015-10-24 10:24:20

To find the current address of any user, we search for UserData with revision DESC and LIMIT 1

To get the address of a user between a certain period of time we can use created_on bewteen (date1 , date 2)


Just my 2 cents. I would create a solution which records exactly what changed, very similar to transient's solution.

My ChangesTable would simple be:

DateTime | WhoChanged | TableName | Action | ID |FieldName | OldValue

1) When an entire row is changed in the main table, lots of entries will go into this table, BUT that is very unlikely, so not a big problem (people are usually only changing one thing) 2) OldVaue (and NewValue if you want) have to be some sort of epic "anytype" since it could be any data, there might be a way to do this with RAW types or just using JSON strings to convert in and out.

Minimum data usage, stores everything you need and can be used for all tables at once. I'm researching this myself right now, but this might end up being the way I go.

For Create and Delete, just the row ID, no fields needed. On delete a flag on the main table (active?) would be good.


You could create triggers to solve this. Here is a tutorial to do so (archived link).

Setting constraints and rules in the database is better than writing special code to handle the same task since it will prevent another developer from writing a different query that bypasses all of the special code and could leave your database with poor data integrity.

For a long time I was copying info to another table using a script since MySQL didn’t support triggers at the time. I have now found this trigger to be more effective at keeping track of everything.

This trigger will copy an old value to a history table if it is changed when someone edits a row. Editor ID and last mod are stored in the original table every time someone edits that row; the time corresponds to when it was changed to its current form.

DROP TRIGGER IF EXISTS history_trigger $$

CREATE TRIGGER history_trigger
BEFORE UPDATE ON clients
    FOR EACH ROW
    BEGIN
        IF OLD.first_name != NEW.first_name
        THEN
                INSERT INTO history_clients
                    (
                        client_id    ,
                        col          ,
                        value        ,
                        user_id      ,
                        edit_time
                    )
                    VALUES
                    (
                        NEW.client_id,
                        'first_name',
                        NEW.first_name,
                        NEW.editor_id,
                        NEW.last_mod
                    );
        END IF;

        IF OLD.last_name != NEW.last_name
        THEN
                INSERT INTO history_clients
                    (
                        client_id    ,
                        col          ,
                        value        ,
                        user_id      ,
                        edit_time
                    )
                    VALUES
                    (
                        NEW.client_id,
                        'last_name',
                        NEW.last_name,
                        NEW.editor_id,
                        NEW.last_mod
                    );
        END IF;

    END;
$$

Another solution would be to keep an Revision field and update this field on save. You could decide that the max is the newest revision, or that 0 is the most recent row. That's up to you.


Here's a straightforward way to do this:

First, create a history table for each data table you want to track (example query below). This table will have an entry for each insert, update, and delete query performed on each row in the data table.

The structure of the history table will be the same as the data table it tracks except for three additional columns: a column to store the operation that occured (let's call it 'action'), the date and time of the operation, and a column to store a sequence number ('revision'), which increments per operation and is grouped by the primary key column of the data table.

To do this sequencing behavior a two column (composite) index is created on the primary key column and revision column. Note that you can only do sequencing in this fashion if the engine used by the history table is MyISAM (See 'MyISAM Notes' on this page)

The history table is fairly easy to create. In the ALTER TABLE query below (and in the trigger queries below that), replace 'primary_key_column' with the actual name of that column in your data table.

CREATE TABLE MyDB.data_history LIKE MyDB.data;

ALTER TABLE MyDB.data_history MODIFY COLUMN primary_key_column int(11) NOT NULL, 
   DROP PRIMARY KEY, ENGINE = MyISAM, ADD action VARCHAR(8) DEFAULT 'insert' FIRST, 
   ADD revision INT(6) NOT NULL AUTO_INCREMENT AFTER action,
   ADD dt_datetime DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP AFTER revision,
   ADD PRIMARY KEY (primary_key_column, revision);

And then you create the triggers:

DROP TRIGGER IF EXISTS MyDB.data__ai;
DROP TRIGGER IF EXISTS MyDB.data__au;
DROP TRIGGER IF EXISTS MyDB.data__bd;

CREATE TRIGGER MyDB.data__ai AFTER INSERT ON MyDB.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'insert', NULL, NOW(), d.* 
    FROM MyDB.data AS d WHERE d.primary_key_column = NEW.primary_key_column;

CREATE TRIGGER MyDB.data__au AFTER UPDATE ON MyDB.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'update', NULL, NOW(), d.*
    FROM MyDB.data AS d WHERE d.primary_key_column = NEW.primary_key_column;

CREATE TRIGGER MyDB.data__bd BEFORE DELETE ON MyDB.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'delete', NULL, NOW(), d.* 
    FROM MyDB.data AS d WHERE d.primary_key_column = OLD.primary_key_column;

And you're done. Now, all the inserts, updates and deletes in 'MyDb.data' will be recorded in 'MyDb.data_history', giving you a history table like this (minus the contrived 'data_columns' column)

ID    revision   action    data columns..
1     1         'insert'   ....          initial entry for row where ID = 1
1     2         'update'   ....          changes made to row where ID = 1
2     1         'insert'   ....          initial entry, ID = 2
3     1         'insert'   ....          initial entry, ID = 3 
1     3         'update'   ....          more changes made to row where ID = 1
3     2         'update'   ....          changes made to row where ID = 3
2     2         'delete'   ....          deletion of row where ID = 2 

To display the changes for a given column or columns from update to update, you'll need to join the history table to itself on the primary key and sequence columns. You could create a view for this purpose, for example:

CREATE VIEW data_history_changes AS 
   SELECT t2.dt_datetime, t2.action, t1.primary_key_column as 'row id', 
   IF(t1.a_column = t2.a_column, t1.a_column, CONCAT(t1.a_column, " to ", t2.a_column)) as a_column
   FROM MyDB.data_history as t1 INNER join MyDB.data_history as t2 on t1.primary_key_column = t2.primary_key_column 
   WHERE (t1.revision = 1 AND t2.revision = 1) OR t2.revision = t1.revision+1
   ORDER BY t1.primary_key_column ASC, t2.revision ASC

Edit: Oh wow, people like my history table thing from 6 years ago :P

My implementation of it is still humming along, getting bigger and more unwieldy, I would assume. I wrote views and pretty nice UI to look at the history in this database, but I don't think it was ever used much. So it goes.

To address some comments in no particular order:

  • I did my own implementation in PHP that was a little more involved, and avoided some of the problems described in comments (having indexes transferred over, signifcantly. If you transfer over unique indexes to the history table, things will break. There are solutions for this in the comments). Following this post to the letter could be an adventure, depending on how established your database is.

  • If the relationship between the primary key and the revision column seems off it usually means the composite key is borked somehow. On a few rare occasions I had this happen and was at a loss to the cause.

  • I found this solution to be pretty performant, using triggers as it does. Also, MyISAM is fast at inserts, which is all the triggers do. You can improve this further with smart indexing (or lack of...). Inserting a single row into a MyISAM table with a primary key shouldn't be an operation you need to optimize, really, unless you have significant issues going on elsewhere. In the entire time I was running the MySQL database this history table implementation was on, it was never the cause of any of the (many) performance problems that came up.

  • if you're getting repeated inserts, check your software layer for INSERT IGNORE type queries. Hrmm, can't remember now, but I think there are issues with this scheme and transactions which ultimately fail after running multiple DML actions. Something to be aware of, at least.

  • It's important that the fields in the history table and the data table match up. Or, rather, that your data table doesn't have MORE columns than the history table. Otherwise, insert/update/del queries on the data table will fail, when the inserts to the history tables put columns in the query that don't exist (due to d.* in the trigger queries), and the trigger fails. t would be awesome if MySQL had something like schema-triggers, where you could alter the history table if columns were added to the data table. Does MySQL have that now? I do React these days :P


MariaDB supports System Versioning since 10.3 which is the standard SQL feature that does exactly what you want: it stores history of table records and provides access to it via SELECT queries. MariaDB is an open-development fork of MySQL. You can find more on its System Versioning via this link:

https://mariadb.com/kb/en/library/system-versioned-tables/


Why not simply use bin log files? If the replication is set on the Mysql server, and binlog file format is set to ROW, then all the changes could be captured.

A good python library called noplay can be used. More info here.


The direct way of doing this is to create triggers on tables. Set some conditions or mapping methods. When update or delete occurs, it will insert into 'change' table automatically.

But the biggest part is what if we got lots columns and lots of table. We have to type every column's name of every table. Obviously, It's waste of time.

To handle this more gorgeously, we can create some procedures or functions to retrieve name of columns.

We can also use 3rd-part tool simply to do this. Here, I write a java program Mysql Tracker