[sql] In SQL, how can you "group by" in ranges?

Suppose I have a table with a numeric column (lets call it "score").

I'd like to generate a table of counts, that shows how many times scores appeared in each range.

For example:

score range  | number of occurrences
-------------------------------------
   0-9       |        11
  10-19      |        14
  20-29      |         3
   ...       |       ...

In this example there were 11 rows with scores in the range of 0 to 9, 14 rows with scores in the range of 10 to 19, and 3 rows with scores in the range 20-29.

Is there an easy way to set this up? What do you recommend?

This question is related to sql sql-server tsql

The answer is


Perhaps you're asking about keeping such things going...

Of course you'll invoke a full table scan for the queries and if the table containing the scores that need to be tallied (aggregations) is large you might want a better performing solution, you can create a secondary table and use rules, such as on insert - you might look into it.

Not all RDBMS engines have rules, though!


James Curran's answer was the most concise in my opinion, but the output wasn't correct. For SQL Server the simplest statement is as follows:

SELECT 
    [score range] = CAST((Score/10)*10 AS VARCHAR) + ' - ' + CAST((Score/10)*10+9 AS VARCHAR), 
    [number of occurrences] = COUNT(*)
FROM #Scores
GROUP BY Score/10
ORDER BY Score/10

This assumes a #Scores temporary table I used to test it, I just populated 100 rows with random number between 0 and 99.


select cast(score/10 as varchar) + '-' + cast(score/10+9 as varchar), 
       count(*)
from scores
group by score/10

Try

SELECT (str(range) + "-" + str(range + 9) ) AS [Score range], COUNT(score) AS [number of occurances]
FROM (SELECT  score,  int(score / 10 ) * 10  AS range  FROM scoredata )  
GROUP BY range;

select cast(score/10 as varchar) + '-' + cast(score/10+9 as varchar), 
       count(*)
from scores
group by score/10

I'm here because i have similar question but i find the short answers wrong and the one with the continuous "case when" is to much work and seeing anything repetitive in my code hurts my eyes. So here is the solution

SELECT --MIN(score), MAX(score),
    [score range] = CAST(ROUND(score-5,-1)AS VARCHAR) + ' - ' + CAST((ROUND(score-5,-1)+10)AS VARCHAR),
    [number of occurrences] = COUNT(*)
FROM order
GROUP BY  CAST(ROUND(score-5,-1)AS VARCHAR) + ' - ' + CAST((ROUND(score-5,-1)+10)AS VARCHAR)
ORDER BY MIN(score)



This will allow you to not have to specify ranges, and should be SQL server agnostic. Math FTW!

SELECT CONCAT(range,'-',range+9), COUNT(range)
FROM (
  SELECT 
    score - (score % 10) as range
  FROM scores
)

declare @RangeWidth int

set @RangeWidth = 10

select
   Floor(Score/@RangeWidth) as LowerBound,
   Floor(Score/@RangeWidth)+@RangeWidth as UpperBound,
   Count(*)
From
   ScoreTable
group by
   Floor(Score/@RangeWidth)

Because the column being sorted on (Range) is a string, string/word sorting is used instead of numeric sorting.

As long as the strings have zeros to pad out the number lengths the sorting should still be semantically correct:

SELECT t.range AS ScoreRange,
       COUNT(*) AS NumberOfOccurrences
  FROM (SELECT CASE
                    WHEN score BETWEEN 0 AND 9 THEN '00-09'
                    WHEN score BETWEEN 10 AND 19 THEN '10-19'
                    ELSE '20-99'
               END AS Range
          FROM Scores) t
 GROUP BY t.Range

If the range is mixed, simply pad an extra zero:

SELECT t.range AS ScoreRange,
       COUNT(*) AS NumberOfOccurrences
  FROM (SELECT CASE
                    WHEN score BETWEEN 0 AND 9 THEN '000-009'
                    WHEN score BETWEEN 10 AND 19 THEN '010-019'
                    WHEN score BETWEEN 20 AND 99 THEN '020-099'
                    ELSE '100-999'
               END AS Range
          FROM Scores) t
 GROUP BY t.Range

declare @RangeWidth int

set @RangeWidth = 10

select
   Floor(Score/@RangeWidth) as LowerBound,
   Floor(Score/@RangeWidth)+@RangeWidth as UpperBound,
   Count(*)
From
   ScoreTable
group by
   Floor(Score/@RangeWidth)

An alternative approach would involve storing the ranges in a table, instead of embedding them in the query. You would end up with a table, call it Ranges, that looks like this:

LowerLimit   UpperLimit   Range 
0              9          '0-9'
10            19          '10-19'
20            29          '20-29'
30            39          '30-39'

And a query that looks like this:

Select
   Range as [Score Range],
   Count(*) as [Number of Occurences]
from
   Ranges r inner join Scores s on s.Score between r.LowerLimit and r.UpperLimit
group by Range

This does mean setting up a table, but it would be easy to maintain when the desired ranges change. No code changes necessary!


I'm here because i have similar question but i find the short answers wrong and the one with the continuous "case when" is to much work and seeing anything repetitive in my code hurts my eyes. So here is the solution

SELECT --MIN(score), MAX(score),
    [score range] = CAST(ROUND(score-5,-1)AS VARCHAR) + ' - ' + CAST((ROUND(score-5,-1)+10)AS VARCHAR),
    [number of occurrences] = COUNT(*)
FROM order
GROUP BY  CAST(ROUND(score-5,-1)AS VARCHAR) + ' - ' + CAST((ROUND(score-5,-1)+10)AS VARCHAR)
ORDER BY MIN(score)



An alternative approach would involve storing the ranges in a table, instead of embedding them in the query. You would end up with a table, call it Ranges, that looks like this:

LowerLimit   UpperLimit   Range 
0              9          '0-9'
10            19          '10-19'
20            29          '20-29'
30            39          '30-39'

And a query that looks like this:

Select
   Range as [Score Range],
   Count(*) as [Number of Occurences]
from
   Ranges r inner join Scores s on s.Score between r.LowerLimit and r.UpperLimit
group by Range

This does mean setting up a table, but it would be easy to maintain when the desired ranges change. No code changes necessary!


create table scores (
   user_id int,
   score int
)

select t.range as [score range], count(*) as [number of occurences]
from (
      select user_id,
         case when score >= 0 and score < 10 then '0-9'
         case when score >= 10 and score < 20 then '10-19'
         ...
         else '90-99' as range
     from scores) t
group by t.range

Perhaps you're asking about keeping such things going...

Of course you'll invoke a full table scan for the queries and if the table containing the scores that need to be tallied (aggregations) is large you might want a better performing solution, you can create a secondary table and use rules, such as on insert - you might look into it.

Not all RDBMS engines have rules, though!


In postgres (where || is the string concatenation operator):

select (score/10)*10 || '-' || (score/10)*10+9 as scorerange, count(*)
from scores
group by score/10
order by 1

gives:

 scorerange | count 
------------+-------
 0-9        |    11
 10-19      |    14
 20-29      |     3
 30-39      |     2

I see answers here that won't work in SQL Server's syntax. I would use:

select t.range as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then ' 0-9 '
    when score between 10 and 19 then '10-19'
    when score between 20 and 29 then '20-29'
    ...
    else '90-99' end as range
  from scores) t
group by t.range

EDIT: see comments


select t.blah as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then ' 0-9 '
    when score between 10 and 19 then '10-19'
    when score between 20 and 29 then '20-29'
    ...
    else '90-99' end as blah
  from scores) t
group by t.blah

Make sure you use a word other than 'range' if you are in MySQL, or you will get an error for running the above example.


In postgres (where || is the string concatenation operator):

select (score/10)*10 || '-' || (score/10)*10+9 as scorerange, count(*)
from scores
group by score/10
order by 1

gives:

 scorerange | count 
------------+-------
 0-9        |    11
 10-19      |    14
 20-29      |     3
 30-39      |     2

select cast(score/10 as varchar) + '-' + cast(score/10+9 as varchar), 
       count(*)
from scores
group by score/10

Because the column being sorted on (Range) is a string, string/word sorting is used instead of numeric sorting.

As long as the strings have zeros to pad out the number lengths the sorting should still be semantically correct:

SELECT t.range AS ScoreRange,
       COUNT(*) AS NumberOfOccurrences
  FROM (SELECT CASE
                    WHEN score BETWEEN 0 AND 9 THEN '00-09'
                    WHEN score BETWEEN 10 AND 19 THEN '10-19'
                    ELSE '20-99'
               END AS Range
          FROM Scores) t
 GROUP BY t.Range

If the range is mixed, simply pad an extra zero:

SELECT t.range AS ScoreRange,
       COUNT(*) AS NumberOfOccurrences
  FROM (SELECT CASE
                    WHEN score BETWEEN 0 AND 9 THEN '000-009'
                    WHEN score BETWEEN 10 AND 19 THEN '010-019'
                    WHEN score BETWEEN 20 AND 99 THEN '020-099'
                    ELSE '100-999'
               END AS Range
          FROM Scores) t
 GROUP BY t.Range

Perhaps you're asking about keeping such things going...

Of course you'll invoke a full table scan for the queries and if the table containing the scores that need to be tallied (aggregations) is large you might want a better performing solution, you can create a secondary table and use rules, such as on insert - you might look into it.

Not all RDBMS engines have rules, though!


create table scores (
   user_id int,
   score int
)

select t.range as [score range], count(*) as [number of occurences]
from (
      select user_id,
         case when score >= 0 and score < 10 then '0-9'
         case when score >= 10 and score < 20 then '10-19'
         ...
         else '90-99' as range
     from scores) t
group by t.range

select t.range as score, count(*) as Count 
from (
      select UserId,
         case when isnull(score ,0) >= 0 and isnull(score ,0)< 5 then '0-5'
                when isnull(score ,0) >= 5 and isnull(score ,0)< 10 then '5-10'
                when isnull(score ,0) >= 10 and isnull(score ,0)< 15 then '10-15'
                when isnull(score ,0) >= 15 and isnull(score ,0)< 20 then '15-20'               
         else ' 20+' end as range
         ,case when isnull(score ,0) >= 0 and isnull(score ,0)< 5 then 1
                when isnull(score ,0) >= 5 and isnull(score ,0)< 10 then 2
                when isnull(score ,0) >= 10 and isnull(score ,0)< 15 then 3
                when isnull(score ,0) >= 15 and isnull(score ,0)< 20 then 4             
         else 5  end as pd
     from score table
     ) t

group by t.range,pd order by pd

I see answers here that won't work in SQL Server's syntax. I would use:

select t.range as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then ' 0-9 '
    when score between 10 and 19 then '10-19'
    when score between 20 and 29 then '20-29'
    ...
    else '90-99' end as range
  from scores) t
group by t.range

EDIT: see comments


James Curran's answer was the most concise in my opinion, but the output wasn't correct. For SQL Server the simplest statement is as follows:

SELECT 
    [score range] = CAST((Score/10)*10 AS VARCHAR) + ' - ' + CAST((Score/10)*10+9 AS VARCHAR), 
    [number of occurrences] = COUNT(*)
FROM #Scores
GROUP BY Score/10
ORDER BY Score/10

This assumes a #Scores temporary table I used to test it, I just populated 100 rows with random number between 0 and 99.


I would do this a little differently so that it scales without having to define every case:

select t.range as [score range], count(*) as [number of occurences]
from (
  select FLOOR(score/10) as range
  from scores) t
group by t.range

Not tested, but you get the idea...


An alternative approach would involve storing the ranges in a table, instead of embedding them in the query. You would end up with a table, call it Ranges, that looks like this:

LowerLimit   UpperLimit   Range 
0              9          '0-9'
10            19          '10-19'
20            29          '20-29'
30            39          '30-39'

And a query that looks like this:

Select
   Range as [Score Range],
   Count(*) as [Number of Occurences]
from
   Ranges r inner join Scores s on s.Score between r.LowerLimit and r.UpperLimit
group by Range

This does mean setting up a table, but it would be easy to maintain when the desired ranges change. No code changes necessary!


Try

SELECT (str(range) + "-" + str(range + 9) ) AS [Score range], COUNT(score) AS [number of occurances]
FROM (SELECT  score,  int(score / 10 ) * 10  AS range  FROM scoredata )  
GROUP BY range;

An alternative approach would involve storing the ranges in a table, instead of embedding them in the query. You would end up with a table, call it Ranges, that looks like this:

LowerLimit   UpperLimit   Range 
0              9          '0-9'
10            19          '10-19'
20            29          '20-29'
30            39          '30-39'

And a query that looks like this:

Select
   Range as [Score Range],
   Count(*) as [Number of Occurences]
from
   Ranges r inner join Scores s on s.Score between r.LowerLimit and r.UpperLimit
group by Range

This does mean setting up a table, but it would be easy to maintain when the desired ranges change. No code changes necessary!


In postgres (where || is the string concatenation operator):

select (score/10)*10 || '-' || (score/10)*10+9 as scorerange, count(*)
from scores
group by score/10
order by 1

gives:

 scorerange | count 
------------+-------
 0-9        |    11
 10-19      |    14
 20-29      |     3
 30-39      |     2

select cast(score/10 as varchar) + '-' + cast(score/10+9 as varchar), 
       count(*)
from scores
group by score/10

I would do this a little differently so that it scales without having to define every case:

select t.range as [score range], count(*) as [number of occurences]
from (
  select FLOOR(score/10) as range
  from scores) t
group by t.range

Not tested, but you get the idea...


declare @RangeWidth int

set @RangeWidth = 10

select
   Floor(Score/@RangeWidth) as LowerBound,
   Floor(Score/@RangeWidth)+@RangeWidth as UpperBound,
   Count(*)
From
   ScoreTable
group by
   Floor(Score/@RangeWidth)

Perhaps you're asking about keeping such things going...

Of course you'll invoke a full table scan for the queries and if the table containing the scores that need to be tallied (aggregations) is large you might want a better performing solution, you can create a secondary table and use rules, such as on insert - you might look into it.

Not all RDBMS engines have rules, though!


select t.range as score, count(*) as Count 
from (
      select UserId,
         case when isnull(score ,0) >= 0 and isnull(score ,0)< 5 then '0-5'
                when isnull(score ,0) >= 5 and isnull(score ,0)< 10 then '5-10'
                when isnull(score ,0) >= 10 and isnull(score ,0)< 15 then '10-15'
                when isnull(score ,0) >= 15 and isnull(score ,0)< 20 then '15-20'               
         else ' 20+' end as range
         ,case when isnull(score ,0) >= 0 and isnull(score ,0)< 5 then 1
                when isnull(score ,0) >= 5 and isnull(score ,0)< 10 then 2
                when isnull(score ,0) >= 10 and isnull(score ,0)< 15 then 3
                when isnull(score ,0) >= 15 and isnull(score ,0)< 20 then 4             
         else 5  end as pd
     from score table
     ) t

group by t.range,pd order by pd

select t.blah as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then ' 0-9 '
    when score between 10 and 19 then '10-19'
    when score between 20 and 29 then '20-29'
    ...
    else '90-99' end as blah
  from scores) t
group by t.blah

Make sure you use a word other than 'range' if you are in MySQL, or you will get an error for running the above example.


declare @RangeWidth int

set @RangeWidth = 10

select
   Floor(Score/@RangeWidth) as LowerBound,
   Floor(Score/@RangeWidth)+@RangeWidth as UpperBound,
   Count(*)
From
   ScoreTable
group by
   Floor(Score/@RangeWidth)

James Curran's answer was the most concise in my opinion, but the output wasn't correct. For SQL Server the simplest statement is as follows:

SELECT 
    [score range] = CAST((Score/10)*10 AS VARCHAR) + ' - ' + CAST((Score/10)*10+9 AS VARCHAR), 
    [number of occurrences] = COUNT(*)
FROM #Scores
GROUP BY Score/10
ORDER BY Score/10

This assumes a #Scores temporary table I used to test it, I just populated 100 rows with random number between 0 and 99.


In postgres (where || is the string concatenation operator):

select (score/10)*10 || '-' || (score/10)*10+9 as scorerange, count(*)
from scores
group by score/10
order by 1

gives:

 scorerange | count 
------------+-------
 0-9        |    11
 10-19      |    14
 20-29      |     3
 30-39      |     2

This will allow you to not have to specify ranges, and should be SQL server agnostic. Math FTW!

SELECT CONCAT(range,'-',range+9), COUNT(range)
FROM (
  SELECT 
    score - (score % 10) as range
  FROM scores
)

I see answers here that won't work in SQL Server's syntax. I would use:

select t.range as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then ' 0-9 '
    when score between 10 and 19 then '10-19'
    when score between 20 and 29 then '20-29'
    ...
    else '90-99' end as range
  from scores) t
group by t.range

EDIT: see comments


James Curran's answer was the most concise in my opinion, but the output wasn't correct. For SQL Server the simplest statement is as follows:

SELECT 
    [score range] = CAST((Score/10)*10 AS VARCHAR) + ' - ' + CAST((Score/10)*10+9 AS VARCHAR), 
    [number of occurrences] = COUNT(*)
FROM #Scores
GROUP BY Score/10
ORDER BY Score/10

This assumes a #Scores temporary table I used to test it, I just populated 100 rows with random number between 0 and 99.


create table scores (
   user_id int,
   score int
)

select t.range as [score range], count(*) as [number of occurences]
from (
      select user_id,
         case when score >= 0 and score < 10 then '0-9'
         case when score >= 10 and score < 20 then '10-19'
         ...
         else '90-99' as range
     from scores) t
group by t.range

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to tsql

Passing multiple values for same variable in stored procedure Count the Number of Tables in a SQL Server Database Change Date Format(DD/MM/YYYY) in SQL SELECT Statement Stored procedure with default parameters Format number as percent in MS SQL Server EXEC sp_executesql with multiple parameters SQL Server after update trigger How to compare datetime with only date in SQL Server Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot Printing integer variable and string on same line in SQL