[sql] SQL JOIN, GROUP BY on three tables to get totals

I know this is late, but it does answer your original question.

/*Read the comments the same way that SQL runs the query
    1) FROM 
    2) GROUP 
    3) SELECT 
    4) My final notes at the bottom 
*/
SELECT 
        list.invoiceid
    ,   cust.customernumber 
    ,   MAX(list.inv_amount) AS invoice_amount/* we select the max because it will be the same for each payment to that invoice (presumably invoice amounts do not vary based on payment) */
    ,   MAX(list.inv_amount) - SUM(list.pay_amount)  AS [amount_due]
FROM 
Customers AS cust 
    INNER JOIN 
Payments  AS pay 
    ON 
        pay.customerid = cust.customerid
INNER JOIN  (   /* generate a list of payment_ids, their amounts, and the totals of the invoices they billed to*/
    SELECT 
            inpay.paymentid AS paymentid
        ,   inv.invoiceid AS invoiceid 
        ,   inv.amount  AS inv_amount 
        ,   pay.amount AS pay_amount 
    FROM 
    InvoicePayments AS inpay
        INNER JOIN 
    Invoices AS inv 
        ON  inv.invoiceid = inpay.invoiceid 
        INNER JOIN 
    Payments AS pay 
        ON pay.paymentid = inpay.paymentid
    )  AS list
ON 
    list.paymentid = pay.paymentid
    /* so at this point my result set would look like: 
    -- All my customers (crossed by) every paymentid they are associated to (I'll call this A)
    -- Every invoice payment and its association to: its own ammount, the total invoice ammount, its own paymentid (what I call list) 
    -- Filter out all records in A that do not have a paymentid matching in (list)
     -- we filter the result because there may be payments that did not go towards invoices!
 */
GROUP BY
    /* we want a record line for each customer and invoice ( or basically each invoice but i believe this makes more sense logically */ 
        cust.customernumber 
    ,   list.invoiceid 
/*
    -- we can improve this query by only hitting the Payments table once by moving it inside of our list subquery, 
    -- but this is what made sense to me when I was planning. 
    -- Hopefully it makes it clearer how the thought process works to leave it in there
    -- as several people have already pointed out, the data structure of the DB prevents us from looking at customers with invoices that have no payments towards them.
*/

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to join

Pandas Merging 101 pandas: merge (join) two data frames on multiple columns How to use the COLLATE in a JOIN in SQL Server? How to join multiple collections with $lookup in mongodb How to join on multiple columns in Pyspark? Pandas join issue: columns overlap but no suffix specified MySQL select rows where left join is null How to return rows from left table not found in right table? Why do multiple-table joins produce duplicate rows? pandas three-way joining multiple dataframes on columns

Examples related to aggregate

Pandas group-by and sum SELECT list is not in GROUP BY clause and contains nonaggregated column Aggregate multiple columns at once Pandas sum by groupby, but exclude certain columns Extract the maximum value within each group in a dataframe How to group dataframe rows into list in pandas groupby Mean per group in a data.frame Summarizing multiple columns with dplyr? data.frame Group By column Compute mean and standard deviation by group for multiple variables in a data.frame