[mysql] MySQL JOIN the most recent row only?

I have a table customer that stores a customer_id, email and reference. There is an additional table customer_data that stores a historical record of the changes made to the customer, i.e. when there's a change made a new row is inserted.

In order to display the customer information in a table, the two tables need to be joined, however only the most recent row from customer_data should be joined to the customer table.

It gets a little more complicated in that the query is paginated, so has a limit and an offset.

How can I do this with MySQL? I think I'm wanting to put a DISTINCT in there somewhere...

The query at the minute is like this-

SELECT *, CONCAT(title,' ',forename,' ',surname) AS name
FROM customer c
INNER JOIN customer_data d on c.customer_id=d.customer_id
WHERE name LIKE '%Smith%' LIMIT 10, 20

Additionaly, am I right in thinking I can use CONCAT with LIKE in this way?

(I appreciate that INNER JOIN might be the wrong type of JOIN to use. I actually have no clue what the difference is between the different JOINs. I'm going to look into that now!)

This question is related to mysql sql join

The answer is


Presuming the autoincrement column in customer_data is named Id, you can do:

SELECT CONCAT(title,' ',forename,' ',surname) AS name *
FROM customer c
    INNER JOIN customer_data d 
        ON c.customer_id=d.customer_id
WHERE name LIKE '%Smith%'
    AND d.ID = (
                Select Max(D2.Id)
                From customer_data As D2
                Where D2.customer_id = D.customer_id
                )
LIMIT 10, 20

You can also do this

SELECT    CONCAT(title, ' ', forename, ' ', surname) AS name
FROM      customer c
LEFT JOIN  (
              SELECT * FROM  customer_data ORDER BY id DESC
          ) customer_data ON (customer_data.customer_id = c.customer_id)
GROUP BY  c.customer_id          
WHERE     CONCAT(title, ' ', forename, ' ', surname) LIKE '%Smith%' 
LIMIT     10, 20;

If you are working with heavy queries, you better move the request for the latest row in the where clause. It is a lot faster and looks cleaner.

SELECT c.*,
FROM client AS c
LEFT JOIN client_calling_history AS cch ON cch.client_id = c.client_id
WHERE
   cch.cchid = (
      SELECT MAX(cchid)
      FROM client_calling_history
      WHERE client_id = c.client_id AND cal_event_id = c.cal_event_id
   )

I know this question is old, but it's got a lot of attention over the years and I think it's missing a concept which may help someone in a similar case. I'm adding it here for completeness sake.

If you cannot modify your original database schema, then a lot of good answers have been provided and solve the problem just fine.

If you can, however, modify your schema, I would advise to add a field in your customer table that holds the id of the latest customer_data record for this customer:

CREATE TABLE customer (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  current_data_id INT UNSIGNED NULL DEFAULT NULL
);

CREATE TABLE customer_data (
   id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
   customer_id INT UNSIGNED NOT NULL, 
   title VARCHAR(10) NOT NULL,
   forename VARCHAR(10) NOT NULL,
   surname VARCHAR(10) NOT NULL
);

Querying customers

Querying is as easy and fast as it can be:

SELECT c.*, d.title, d.forename, d.surname
FROM customer c
INNER JOIN customer_data d on d.id = c.current_data_id
WHERE ...;

The drawback is the extra complexity when creating or updating a customer.

Updating a customer

Whenever you want to update a customer, you insert a new record in the customer_data table, and update the customer record.

INSERT INTO customer_data (customer_id, title, forename, surname) VALUES(2, 'Mr', 'John', 'Smith');
UPDATE customer SET current_data_id = LAST_INSERT_ID() WHERE id = 2;

Creating a customer

Creating a customer is just a matter of inserting the customer entry, then running the same statements:

INSERT INTO customer () VALUES ();

SET @customer_id = LAST_INSERT_ID();
INSERT INTO customer_data (customer_id, title, forename, surname) VALUES(@customer_id, 'Mr', 'John', 'Smith');
UPDATE customer SET current_data_id = LAST_INSERT_ID() WHERE id = @customer_id;

Wrapping up

The extra complexity for creating/updating a customer might be fearsome, but it can easily be automated with triggers.

Finally, if you're using an ORM, this can be really easy to manage. The ORM can take care of inserting the values, updating the ids, and joining the two tables automatically for you.

Here is how your mutable Customer model would look like:

class Customer
{
    private int id;
    private CustomerData currentData;

    public Customer(String title, String forename, String surname)
    {
        this.update(title, forename, surname);
    }

    public void update(String title, String forename, String surname)
    {
        this.currentData = new CustomerData(this, title, forename, surname);
    }

    public String getTitle()
    {
        return this.currentData.getTitle();
    }

    public String getForename()
    {
        return this.currentData.getForename();
    }

    public String getSurname()
    {
        return this.currentData.getSurname();
    }
}

And your immutable CustomerData model, that contains only getters:

class CustomerData
{
    private int id;
    private Customer customer;
    private String title;
    private String forename;
    private String surname;

    public CustomerData(Customer customer, String title, String forename, String surname)
    {
        this.customer = customer;
        this.title    = title;
        this.forename = forename;
        this.surname  = surname;
    }

    public String getTitle()
    {
        return this.title;
    }

    public String getForename()
    {
        return this.forename;
    }

    public String getSurname()
    {
        return this.surname;
    }
}

It's a good idea that logging actual data into "customer_data" table. With this data you can select all data from "customer_data" table as you wish.


SELECT CONCAT(title,' ',forename,' ',surname) AS name * FROM customer c 
INNER JOIN customer_data d on c.id=d.customer_id WHERE name LIKE '%Smith%' 

i think you need to change c.customer_id to c.id

else update table structure


For anyone who must work with an older version of MySQL (pre-5.0 ish) you are unable to do sub-queries for this type of query. Here is the solution I was able to do and it seemed to work great.

SELECT MAX(d.id), d2.*, CONCAT(title,' ',forename,' ',surname) AS name
FROM customer AS c 
LEFT JOIN customer_data as d ON c.customer_id=d.customer_id 
LEFT JOIN customer_data as d2 ON d.id=d2.id
WHERE CONCAT(title, ' ', forename, ' ', surname) LIKE '%Smith%'
GROUP BY c.customer_id LIMIT 10, 20;

Essentially this is finding the max id of your data table joining it to the customer then joining the data table to the max id found. The reason for this is because selecting the max of a group doesn't guarantee that the rest of the data matches with the id unless you join it back onto itself.

I haven't tested this on newer versions of MySQL but it works on 4.0.30.


Examples related to mysql

Implement specialization in ER diagram How to post query parameters with Axios? PHP with MySQL 8.0+ error: The server requested authentication method unknown to the client Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver' phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' is not supported How to resolve Unable to load authentication plugin 'caching_sha2_password' issue Connection Java-MySql : Public Key Retrieval is not allowed How to grant all privileges to root user in MySQL 8.0 MySQL 8.0 - Client does not support authentication protocol requested by server; consider upgrading MySQL client

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to join

Pandas Merging 101 pandas: merge (join) two data frames on multiple columns How to use the COLLATE in a JOIN in SQL Server? How to join multiple collections with $lookup in mongodb How to join on multiple columns in Pyspark? Pandas join issue: columns overlap but no suffix specified MySQL select rows where left join is null How to return rows from left table not found in right table? Why do multiple-table joins produce duplicate rows? pandas three-way joining multiple dataframes on columns