[sql] Difference between INNER JOIN and LEFT SEMI JOIN

Suppose there are 2 tables TableA and TableB with only 2 columns (Id, Data) and following data:

TableA:

+----+---------+
| Id |  Data   |
+----+---------+
|  1 | DataA11 |
|  1 | DataA12 |
|  1 | DataA13 |
|  2 | DataA21 |
|  3 | DataA31 |
+----+---------+

TableB:

+----+---------+
| Id |  Data   |
+----+---------+
|  1 | DataB11 |
|  2 | DataB21 |
|  2 | DataB22 |
|  2 | DataB23 |
|  4 | DataB41 |
+----+---------+

Inner Join on column Id will return columns from both the tables and only the matching records:

.----.---------.----.---------.
| Id |  Data   | Id |  Data   |
:----+---------+----+---------:
|  1 | DataA11 |  1 | DataB11 |
:----+---------+----+---------:
|  1 | DataA12 |  1 | DataB11 |
:----+---------+----+---------:
|  1 | DataA13 |  1 | DataB11 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB21 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB22 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB23 |
'----'---------'----'---------'

Left Join (or Left Outer join) on column Id will return columns from both the tables and matching records with records from left table (Null values from right table):

.----.---------.----.---------.
| Id |  Data   | Id |  Data   |
:----+---------+----+---------:
|  1 | DataA11 |  1 | DataB11 |
:----+---------+----+---------:
|  1 | DataA12 |  1 | DataB11 |
:----+---------+----+---------:
|  1 | DataA13 |  1 | DataB11 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB21 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB22 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB23 |
:----+---------+----+---------:
|  3 | DataA31 |    |         |
'----'---------'----'---------'

Right Join (or Right Outer join) on column Id will return columns from both the tables and matching records with records from right table (Null values from left table):

+-----------------------------+
¦ Id ¦  Data   ¦ Id ¦  Data   ¦
+----+---------+----+---------¦
¦  1 ¦ DataA11 ¦  1 ¦ DataB11 ¦
¦  1 ¦ DataA12 ¦  1 ¦ DataB11 ¦
¦  1 ¦ DataA13 ¦  1 ¦ DataB11 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB21 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB22 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB23 ¦
¦    ¦         ¦  4 ¦ DataB41 ¦
+-----------------------------+

Full Outer Join on column Id will return columns from both the tables and matching records with records from left table (Null values from right table) and records from right table (Null values from left table):

+-----------------------------+
¦ Id ¦  Data   ¦ Id ¦  Data   ¦
¦----+---------+----+---------¦
¦  - ¦         ¦    ¦         ¦
¦  1 ¦ DataA11 ¦  1 ¦ DataB11 ¦
¦  1 ¦ DataA12 ¦  1 ¦ DataB11 ¦
¦  1 ¦ DataA13 ¦  1 ¦ DataB11 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB21 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB22 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB23 ¦
¦  3 ¦ DataA31 ¦    ¦         ¦
¦    ¦         ¦  4 ¦ DataB41 ¦
+-----------------------------+

Left Semi Join on column Id will return columns only from left table and matching records only from left table:

+--------------+
¦ Id ¦  Data   ¦
+----+---------¦
¦  1 ¦ DataA11 ¦
¦  1 ¦ DataA12 ¦
¦  1 ¦ DataA13 ¦
¦  2 ¦ DataA21 ¦
+--------------+

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to hql

Difference between INNER JOIN and LEFT SEMI JOIN [Ljava.lang.Object; cannot be cast to HQL Hibernate INNER JOIN What is the difference between JOIN and JOIN FETCH when using JPA and Hibernate IN-clause in HQL or Java Persistence Query Language ORDER BY using Criteria API How do you do a limit query in JPQL or HQL? Hibernate HQL Query : How to set a Collection as a named parameter of a Query? How do you create a Distinct query in HQL JPA and Hibernate - Criteria vs. JPQL or HQL

Examples related to hive

select rows in sql with latest date for each ID repeated multiple times PySpark: withColumn() with two conditions and three outcomes java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient Hive cast string to date dd-MM-yyyy How to save DataFrame directly to Hive? How to calculate Date difference in Hive Select top 2 rows in Hive Just get column names from hive table Create hive table using "as select" or "like" and also specify delimiter Hive Alter table change Column Name