I'm trying to select a column from a single table (no joins) and I need the count of the number of rows, ideally before I begin retrieving the rows. I have come to two approaches that provide the information I need.
Approach 1:
SELECT COUNT( my_table.my_col ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
Then
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
Or Approach 2
SELECT my_table.my_col, ( SELECT COUNT ( my_table.my_col )
FROM my_table
WHERE my_table.foo = 'bar' ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
I am doing this because my SQL driver (SQL Native Client 9.0) does not allow me to use SQLRowCount on a SELECT statement but I need to know the number of rows in my result in order to allocate an array before assigning information to it. The use of a dynamically allocated container is, unfortunately, not an option in this area of my program.
I am concerned that the following scenario might occur:
Does Approach 2 prohibit this issue?
Also, Will one of the two approaches be faster? If so, which?
Finally, is there a better approach that I should consider (perhaps a way to instruct the driver to return the number of rows in a SELECT result using SQLRowCount?)
For those that asked, I am using Native C++ with the aforementioned SQL driver (provided by Microsoft.)
You might want to think about a better pattern for dealing with data of this type.
No self-prespecting SQL driver will tell you how many rows your query will return before returning the rows, because the answer might change (unless you use a Transaction, which creates problems of its own.)
The number of rows won't change - google for ACID and SQL.
If you're concerned the number of rows that meet the condition may change in the few milliseconds since execution of the query and retrieval of results, you could/should execute the queries inside a transaction:
BEGIN TRAN bogus
SELECT COUNT( my_table.my_col ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
ROLLBACK TRAN bogus
This would return the correct values, always.
Furthermore, if you're using SQL Server, you can use @@ROWCOUNT to get the number of rows affected by last statement, and redirect the output of real query to a temp table or table variable, so you can return everything altogether, and no need of a transaction:
DECLARE @dummy INT
SELECT my_table.my_col
INTO #temp_table
FROM my_table
WHERE my_table.foo = 'bar'
SET @dummy=@@ROWCOUNT
SELECT @dummy, * FROM #temp_table
IF (@@ROWCOUNT > 0)
BEGIN
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
END
If you're using SQL Server, after your query you can select the @@RowCount function (or if your result set might have more than 2 billion rows use the RowCount_Big() function). This will return the number of rows selected by the previous statement or number of rows affected by an insert/update/delete statement.
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
SELECT @@Rowcount
Or if you want to row count included in the result sent similar to Approach #2, you can use the the OVER clause.
SELECT my_table.my_col,
count(*) OVER(PARTITION BY my_table.foo) AS 'Count'
FROM my_table
WHERE my_table.foo = 'bar'
Using the OVER clause will have much better performance than using a subquery to get the row count. Using the @@RowCount will have the best performance because the there won't be any query cost for the select @@RowCount statement
Update in response to comment: The example I gave would give the # of rows in partition - defined in this case by "PARTITION BY my_table.foo". The value of the column in each row is the # of rows with the same value of my_table.foo. Since your example query had the clause "WHERE my_table.foo = 'bar'", all rows in the resultset will have the same value of my_table.foo and therefore the value in the column will be the same for all rows and equal (in this case) this the # of rows in the query.
Here is a better/simpler example of how to include a column in each row that is the total # of rows in the resultset. Simply remove the optional Partition By clause.
SELECT my_table.my_col, count(*) OVER() AS 'Count'
FROM my_table
WHERE my_table.foo = 'bar'
Why don't you put your results into a vector? That way you don't have to know the size before hand.
If you're using SQL Server, after your query you can select the @@RowCount function (or if your result set might have more than 2 billion rows use the RowCount_Big() function). This will return the number of rows selected by the previous statement or number of rows affected by an insert/update/delete statement.
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
SELECT @@Rowcount
Or if you want to row count included in the result sent similar to Approach #2, you can use the the OVER clause.
SELECT my_table.my_col,
count(*) OVER(PARTITION BY my_table.foo) AS 'Count'
FROM my_table
WHERE my_table.foo = 'bar'
Using the OVER clause will have much better performance than using a subquery to get the row count. Using the @@RowCount will have the best performance because the there won't be any query cost for the select @@RowCount statement
Update in response to comment: The example I gave would give the # of rows in partition - defined in this case by "PARTITION BY my_table.foo". The value of the column in each row is the # of rows with the same value of my_table.foo. Since your example query had the clause "WHERE my_table.foo = 'bar'", all rows in the resultset will have the same value of my_table.foo and therefore the value in the column will be the same for all rows and equal (in this case) this the # of rows in the query.
Here is a better/simpler example of how to include a column in each row that is the total # of rows in the resultset. Simply remove the optional Partition By clause.
SELECT my_table.my_col, count(*) OVER() AS 'Count'
FROM my_table
WHERE my_table.foo = 'bar'
Just to add this because this is the top result in google for this question. In sqlite I used this to get the rowcount.
WITH temptable AS
(SELECT one,two
FROM
(SELECT one, two
FROM table3
WHERE dimension=0
UNION ALL SELECT one, two
FROM table2
WHERE dimension=0
UNION ALL SELECT one, two
FROM table1
WHERE dimension=0)
ORDER BY date DESC)
SELECT *
FROM temptable
LEFT JOIN
(SELECT count(*)/7 AS cnt,
0 AS bonus
FROM temptable) counter
WHERE 0 = counter.bonus
You might want to think about a better pattern for dealing with data of this type.
No self-prespecting SQL driver will tell you how many rows your query will return before returning the rows, because the answer might change (unless you use a Transaction, which creates problems of its own.)
The number of rows won't change - google for ACID and SQL.
If you're concerned the number of rows that meet the condition may change in the few milliseconds since execution of the query and retrieval of results, you could/should execute the queries inside a transaction:
BEGIN TRAN bogus
SELECT COUNT( my_table.my_col ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
ROLLBACK TRAN bogus
This would return the correct values, always.
Furthermore, if you're using SQL Server, you can use @@ROWCOUNT to get the number of rows affected by last statement, and redirect the output of real query to a temp table or table variable, so you can return everything altogether, and no need of a transaction:
DECLARE @dummy INT
SELECT my_table.my_col
INTO #temp_table
FROM my_table
WHERE my_table.foo = 'bar'
SET @dummy=@@ROWCOUNT
SELECT @dummy, * FROM #temp_table
You might want to think about a better pattern for dealing with data of this type.
No self-prespecting SQL driver will tell you how many rows your query will return before returning the rows, because the answer might change (unless you use a Transaction, which creates problems of its own.)
The number of rows won't change - google for ACID and SQL.
If you're using SQL Server, after your query you can select the @@RowCount function (or if your result set might have more than 2 billion rows use the RowCount_Big() function). This will return the number of rows selected by the previous statement or number of rows affected by an insert/update/delete statement.
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
SELECT @@Rowcount
Or if you want to row count included in the result sent similar to Approach #2, you can use the the OVER clause.
SELECT my_table.my_col,
count(*) OVER(PARTITION BY my_table.foo) AS 'Count'
FROM my_table
WHERE my_table.foo = 'bar'
Using the OVER clause will have much better performance than using a subquery to get the row count. Using the @@RowCount will have the best performance because the there won't be any query cost for the select @@RowCount statement
Update in response to comment: The example I gave would give the # of rows in partition - defined in this case by "PARTITION BY my_table.foo". The value of the column in each row is the # of rows with the same value of my_table.foo. Since your example query had the clause "WHERE my_table.foo = 'bar'", all rows in the resultset will have the same value of my_table.foo and therefore the value in the column will be the same for all rows and equal (in this case) this the # of rows in the query.
Here is a better/simpler example of how to include a column in each row that is the total # of rows in the resultset. Simply remove the optional Partition By clause.
SELECT my_table.my_col, count(*) OVER() AS 'Count'
FROM my_table
WHERE my_table.foo = 'bar'
Why don't you put your results into a vector? That way you don't have to know the size before hand.
If you are really concerned that your row count will change between the select count and the select statement, why not select your rows into a temp table first? That way, you know you will be in sync.
If you're concerned the number of rows that meet the condition may change in the few milliseconds since execution of the query and retrieval of results, you could/should execute the queries inside a transaction:
BEGIN TRAN bogus
SELECT COUNT( my_table.my_col ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
ROLLBACK TRAN bogus
This would return the correct values, always.
Furthermore, if you're using SQL Server, you can use @@ROWCOUNT to get the number of rows affected by last statement, and redirect the output of real query to a temp table or table variable, so you can return everything altogether, and no need of a transaction:
DECLARE @dummy INT
SELECT my_table.my_col
INTO #temp_table
FROM my_table
WHERE my_table.foo = 'bar'
SET @dummy=@@ROWCOUNT
SELECT @dummy, * FROM #temp_table
Here are some ideas:
If you are really concerned that your row count will change between the select count and the select statement, why not select your rows into a temp table first? That way, you know you will be in sync.
Why don't you put your results into a vector? That way you don't have to know the size before hand.
You might want to think about a better pattern for dealing with data of this type.
No self-prespecting SQL driver will tell you how many rows your query will return before returning the rows, because the answer might change (unless you use a Transaction, which creates problems of its own.)
The number of rows won't change - google for ACID and SQL.
Approach 2 will always return a count that matches your result set.
I suggest you link the sub-query to your outer query though, to guarantee that the condition on your count matches the condition on the dataset.
SELECT
mt.my_row,
(SELECT COUNT(mt2.my_row) FROM my_table mt2 WHERE mt2.foo = mt.foo) as cnt
FROM my_table mt
WHERE mt.foo = 'bar';
IF (@@ROWCOUNT > 0)
BEGIN
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
END
Approach 2 will always return a count that matches your result set.
I suggest you link the sub-query to your outer query though, to guarantee that the condition on your count matches the condition on the dataset.
SELECT
mt.my_row,
(SELECT COUNT(mt2.my_row) FROM my_table mt2 WHERE mt2.foo = mt.foo) as cnt
FROM my_table mt
WHERE mt.foo = 'bar';
If you're concerned the number of rows that meet the condition may change in the few milliseconds since execution of the query and retrieval of results, you could/should execute the queries inside a transaction:
BEGIN TRAN bogus
SELECT COUNT( my_table.my_col ) AS row_count
FROM my_table
WHERE my_table.foo = 'bar'
SELECT my_table.my_col
FROM my_table
WHERE my_table.foo = 'bar'
ROLLBACK TRAN bogus
This would return the correct values, always.
Furthermore, if you're using SQL Server, you can use @@ROWCOUNT to get the number of rows affected by last statement, and redirect the output of real query to a temp table or table variable, so you can return everything altogether, and no need of a transaction:
DECLARE @dummy INT
SELECT my_table.my_col
INTO #temp_table
FROM my_table
WHERE my_table.foo = 'bar'
SET @dummy=@@ROWCOUNT
SELECT @dummy, * FROM #temp_table
Why don't you put your results into a vector? That way you don't have to know the size before hand.
Here are some ideas:
Here are some ideas:
If you are really concerned that your row count will change between the select count and the select statement, why not select your rows into a temp table first? That way, you know you will be in sync.
Here are some ideas:
Approach 2 will always return a count that matches your result set.
I suggest you link the sub-query to your outer query though, to guarantee that the condition on your count matches the condition on the dataset.
SELECT
mt.my_row,
(SELECT COUNT(mt2.my_row) FROM my_table mt2 WHERE mt2.foo = mt.foo) as cnt
FROM my_table mt
WHERE mt.foo = 'bar';
Source: Stackoverflow.com