[mysql] how to use a like with a join in sql?

I have two tables, say table A and table B, and I want to join them, but the matching condition has to be that a column from A 'is like' a column from B, meaning that the value from A can appear in the column in B with anything before or after it.

For example, if the column in A is 'foo', then the join would match if the column in B is 'fooblah', 'somethingfooblah', or just 'foo'. I know how to use wildcards in a standard LIKE statement, but I am confused about how to use them in a join. Does this make sense? Thanks.

This question is related to mysql sql join sql-like

The answer is


In MySQL you could try:

SELECT * FROM A INNER JOIN B ON B.MYCOL LIKE CONCAT('%', A.MYCOL, '%');

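Of course, this would be a massively inefficient query: the leading '%' wildcard prevents an index on B.MYCOL from being used, so it would do a full table scan, comparing every row of A against every row of B.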

Update: Here's a proof


create table A (MYCOL varchar(255));
create table B (MYCOL varchar(255));
insert into A (MYCOL) values ('foo'), ('bar'), ('baz');
insert into B (MYCOL) values ('fooblah'), ('somethingfooblah'), ('foo');
insert into B (MYCOL) values ('barblah'), ('somethingbarblah'), ('bar');
SELECT * FROM A INNER JOIN B ON B.MYCOL LIKE CONCAT('%', A.MYCOL, '%');
+-------+------------------+
| MYCOL | MYCOL            |
+-------+------------------+
| foo   | fooblah          |
| foo   | somethingfooblah |
| foo   | foo              |
| bar   | barblah          |
| bar   | somethingbarblah |
| bar   | bar              |
+-------+------------------+
6 rows in set (0.38 sec)

Putting a condition in a join is definitely different from putting it in the WHERE clause when outer joins are involved. The ON condition decides which rows are matched, while the WHERE clause filters the result after the join, so depending on the cardinality between the tables the two can return different rows.

For example, using a LIKE condition in the ON clause of an outer join keeps all records from the first table listed in the join. Using the same condition in the WHERE clause implicitly changes the join to an inner join, because a record generally has to be present in both tables for the comparison in the WHERE clause to succeed.

I generally use the style given in one of the prior answers.

-- SQL Server syntax; in MySQL build the pattern with CONCAT('%', tb.`Desc`, '%')
SELECT *
FROM tbl_A AS ta
    LEFT OUTER JOIN tbl_B AS tb
            ON ta.[Desc] LIKE '%' + tb.[Desc] + '%'

This way I can control the join type.
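
The difference between putting the LIKE condition in ON and putting it in WHERE can be checked with a small experiment. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL or SQL Server, so the pattern is built with SQLite's || operator rather than CONCAT or +, and the column name Descr and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_A (Descr TEXT);
    CREATE TABLE tbl_B (Descr TEXT);
    INSERT INTO tbl_A VALUES ('somethingfooblah'), ('nomatch');
    INSERT INTO tbl_B VALUES ('foo');
""")

# LIKE in the ON clause: rows of tbl_A without a match are kept,
# with NULL (None in Python) for the tbl_B column.
on_rows = conn.execute("""
    SELECT ta.Descr, tb.Descr
    FROM tbl_A AS ta
    LEFT OUTER JOIN tbl_B AS tb
        ON ta.Descr LIKE '%' || tb.Descr || '%'
    ORDER BY ta.Descr
""").fetchall()

# Same LIKE in the WHERE clause: the filter runs after the join,
# so rows of tbl_A that fail the comparison are dropped.
where_rows = conn.execute("""
    SELECT ta.Descr, tb.Descr
    FROM tbl_A AS ta
    LEFT OUTER JOIN tbl_B AS tb ON 1 = 1
    WHERE ta.Descr LIKE '%' || tb.Descr || '%'
    ORDER BY ta.Descr
""").fetchall()

print(on_rows)
print(where_rows)
```

In this run the ON version returns both rows of tbl_A (one of them padded with NULL), while the WHERE version returns only the matching row, which is the "implicit inner join" effect described above.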


If this is something you'll need to do often, then you may want to denormalize the relationship between tables A and B.

For example, on insert to table B, you could write zero or more entries to a junction table mapping B to A based on the partial match. Similarly, changes to either table could update this association.

This all depends on how frequently tables A and B are modified. If they are fairly static, then taking a hit on INSERT is less painful than taking repeated hits on SELECT.
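
A rough sketch of the junction-table idea, again using Python's sqlite3 as a stand-in database. The table name A_B, the id columns, and the insert_b helper are all hypothetical names invented for this illustration; a real setup might use triggers instead and would also need to handle updates and deletes on both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, MYCOL TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, MYCOL TEXT);
    -- Hypothetical junction table holding the precomputed partial matches.
    CREATE TABLE A_B (a_id INTEGER, b_id INTEGER, PRIMARY KEY (a_id, b_id));
    INSERT INTO A (MYCOL) VALUES ('foo'), ('bar');
""")

def insert_b(value):
    """Insert a row into B and record which rows of A it contains."""
    with conn:  # both statements commit together or not at all
        b_id = conn.execute(
            "INSERT INTO B (MYCOL) VALUES (?)", (value,)
        ).lastrowid
        # Pay the substring-search cost once, at insert time.
        conn.execute(
            "INSERT INTO A_B (a_id, b_id) "
            "SELECT id, ? FROM A WHERE instr(?, MYCOL) > 0",
            (b_id, value),
        )

for v in ("fooblah", "somethingbarblah", "nothing"):
    insert_b(v)

# Reads now use plain equality joins on the ids instead of LIKE.
matches = conn.execute("""
    SELECT A.MYCOL, B.MYCOL
    FROM A_B
    JOIN A ON A.id = A_B.a_id
    JOIN B ON B.id = A_B.b_id
    ORDER BY B.id
""").fetchall()

print(matches)
```

The SELECT at the end no longer scans B with a wildcard pattern; the cost has moved to insert_b, which is the trade-off described above.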


Using INSTR:

SELECT *
  FROM TABLE a
  JOIN TABLE b ON INSTR(b.column, a.column) > 0

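Using LIKE (SQL Server style concatenation; in MySQL + is numeric addition, so use the CONCAT version instead):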

SELECT *
  FROM TABLE a
  JOIN TABLE b ON b.column LIKE '%'+ a.column +'%'

Using LIKE, with CONCAT:

SELECT *
  FROM TABLE a
  JOIN TABLE b ON b.column LIKE CONCAT('%', a.column ,'%')

Mind that in all of these options, if your columns use a case-sensitive or binary collation, you'll probably want to convert the column values to uppercase BEFORE comparing so that matches do not depend on case:

SELECT *
  FROM (SELECT UPPER(a.column) AS ua
          FROM TABLE a) a
  JOIN (SELECT UPPER(b.column) AS ub
          FROM TABLE b) b ON INSTR(b.ub, a.ua) > 0
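
To see what the uppercase conversion changes, here is a small sketch using Python's sqlite3 module as a stand-in. The assumption to keep in mind is that SQLite's instr is case-sensitive, which may not match how your MySQL collation behaves, and that the column name col and the sample values are invented. For brevity it applies upper() inline rather than through derived tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (col TEXT);
    CREATE TABLE b (col TEXT);
    INSERT INTO a VALUES ('foo');
    INSERT INTO b VALUES ('SomethingFOOblah'), ('fooblah'), ('unrelated');
""")

# Case-sensitive comparison: 'FOO' inside 'SomethingFOOblah' is missed.
raw = conn.execute(
    "SELECT b.col FROM a JOIN b "
    "ON instr(b.col, a.col) > 0 ORDER BY b.col"
).fetchall()

# Both sides folded to uppercase before comparing.
folded = conn.execute(
    "SELECT b.col FROM a JOIN b "
    "ON instr(upper(b.col), upper(a.col)) > 0 ORDER BY b.col"
).fetchall()

print(raw)
print(folded)
```

Wrapping the columns in a function like this also rules out any index use on them, so it is worth checking whether your collation already compares case-insensitively before adding it.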

Which option is most efficient will ultimately depend on the EXPLAIN plan output.

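For inner joins, a condition in the JOIN ... ON clause is equivalent to writing the same condition in the WHERE clause. The explicit JOIN syntax is also referred to as ANSI JOIN syntax because it was standardized (in SQL-92). The older comma-separated style, commonly called non-ANSI JOINs, looks like: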

SELECT *
  FROM TABLE a,
       TABLE b
 WHERE INSTR(b.column, a.column) > 0

I'm not going to bother with a non-ANSI LEFT JOIN example. The benefit of the ANSI JOIN syntax is that it separates the conditions that join the tables together from the conditions that filter rows in the WHERE clause.
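
For an inner join, the two spellings can be compared directly. A minimal sketch, assuming Python's sqlite3 as a stand-in database and invented table contents:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (col TEXT);
    CREATE TABLE b (col TEXT);
    INSERT INTO a VALUES ('foo'), ('bar'), ('baz');
    INSERT INTO b VALUES ('fooblah'), ('somethingbarblah'), ('qux');
""")

# Explicit JOIN ... ON syntax.
join_rows = conn.execute("""
    SELECT a.col, b.col
    FROM a
    JOIN b ON instr(b.col, a.col) > 0
    ORDER BY a.col, b.col
""").fetchall()

# Comma-separated tables with the condition in WHERE.
comma_rows = conn.execute("""
    SELECT a.col, b.col
    FROM a, b
    WHERE instr(b.col, a.col) > 0
    ORDER BY a.col, b.col
""").fetchall()

print(join_rows)
print(comma_rows)
```

Both queries return the same rows here. The equivalence only holds for inner joins; with a LEFT JOIN the placement of the condition matters.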


On our server, queries that use LIKE or INSTR (or CHARINDEX in T-SQL) take too long, so we use LEFT, as in the following structure:

select *
from little
left join big
on left( big.key, len(little.key) ) = little.key

I understand that this only works when the variation is at the end of the value (a prefix match), unlike the other suggestions that use '%' + b + '%', but it is enough, and much faster, if you only need b + '%'.

Another way to optimize it for speed (but not memory) is to create a column in "little" that stores "len(little.key)" as "lenkey" and use that in the query above instead.
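
A sketch of the prefix comparison, with the usual caveats: it runs on Python's sqlite3 as a stand-in, so LEFT(x, n) and LEN(x) are written as substr(x, 1, n) and length(x), the column is named k (a made-up name) instead of key, and the rows are invented. It says nothing about speed; it only checks that the comparison returns the same rows as LIKE with a trailing wildcard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE little (k TEXT);
    CREATE TABLE big (k TEXT);
    INSERT INTO little VALUES ('foo'), ('zzz');
    INSERT INTO big VALUES ('fooblah'), ('somethingfooblah'), ('foo');
""")

# Prefix match: compare the first length(little.k) characters of big.k.
prefix_rows = conn.execute("""
    SELECT little.k, big.k
    FROM little
    LEFT JOIN big
        ON substr(big.k, 1, length(little.k)) = little.k
    ORDER BY little.k, big.k
""").fetchall()

# The LIKE form with only a trailing wildcard, for comparison.
like_rows = conn.execute("""
    SELECT little.k, big.k
    FROM little
    LEFT JOIN big
        ON big.k LIKE little.k || '%'
    ORDER BY little.k, big.k
""").fetchall()

print(prefix_rows)
print(like_rows)
```

Note that 'somethingfooblah' is not matched by either query, since 'foo' is not at the start of the value. That is the limitation mentioned above.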

