How to select distinct on one column & any in another column?

Asked by GillianHamer in SQL Server, asked on Oct 3, 2022

 I need to query an SQL database to find all distinct values of one column and I need an arbitrary value from another column. For example, consider the following table with two columns, key and value:

key     value
===     =====
one     test
one     another
one     value
two     goes
two     here
two     also
three   example

I wish to get back one sample row, chosen arbitrarily, from each distinct key, perhaps getting these three rows:

key     value
===     =====
one     test
two     goes
three   example

How can I formulate such a query in SQL?


Answered by Gloria Holl

To select distinct values of one column and an arbitrary value from another column:


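The easiest query to write is for MySQL, when the strict ONLY_FULL_GROUP_BY SQL mode is disabled. It uses a non-standard construction in which a column that is neither grouped nor aggregated may appear in the SELECT list. (The column names follow the question's example; key is a reserved word in MySQL, so a real query would need to quote it with backticks.)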

SELECT key, value
FROM tableX
GROUP BY key ;

In recent versions (5.7 and 8.0+), where the strict settings and ONLY_FULL_GROUP_BY are the default, you can use the ANY_VALUE() function, added in 5.7:

SELECT key, ANY_VALUE(value) AS value
FROM tableX
GROUP BY key ;

For other DBMSs that have window functions (such as PostgreSQL, SQL Server, Oracle, DB2), you can use ROW_NUMBER() as follows. The advantage is that you can select other columns in the result as well (besides key and value):

SELECT key, value
FROM
    ( SELECT key, value,
             ROW_NUMBER() OVER (PARTITION BY key
                                ORDER BY whatever) --- ORDER BY NULL
               AS rn --- for example
      FROM tableX
    ) tmp
WHERE rn = 1 ;
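The window-function approach can be tried end to end with a small script. The following is a minimal sketch, assuming Python's built-in sqlite3 module and a SQLite build with window-function support (3.25 or later). The id column is a hypothetical extra column added for illustration, and ORDER BY id is just one possible choice for "whatever". The key column is double-quoted because KEY is a keyword in several DBMSs.

```python
import sqlite3

# In-memory SQLite database holding the question's sample data.
# The id column is hypothetical; it is not part of the original table.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tableX (id INTEGER, "key" TEXT, value TEXT)')
conn.executemany(
    "INSERT INTO tableX VALUES (?, ?, ?)",
    [
        (1, "one", "test"), (2, "one", "another"), (3, "one", "value"),
        (4, "two", "goes"), (5, "two", "here"), (6, "two", "also"),
        (7, "three", "example"),
    ],
)

def one_row_per_key(conn):
    """Return one complete row per key: the row with the smallest id."""
    query = '''
        SELECT id, "key", value
        FROM (
            SELECT id, "key", value,
                   ROW_NUMBER() OVER (PARTITION BY "key" ORDER BY id) AS rn
            FROM tableX
        ) tmp
        WHERE rn = 1
        ORDER BY id
    '''
    return conn.execute(query).fetchall()

print(one_row_per_key(conn))
# [(1, 'one', 'test'), (4, 'two', 'goes'), (7, 'three', 'example')]
```

Because the row is picked as a whole, id and value are guaranteed to come from the same original row.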

For older versions of the above, and for any other DBMS, there is a general approach that works almost everywhere. One disadvantage is that you cannot reliably select other columns from the same row with this approach. Another is that aggregate functions such as MIN() and MAX() do not work with some data types in some DBMSs (for example bit, text, and blob types):

SELECT key, MIN(value) AS value
FROM tableX
GROUP BY key ;
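The limitation on other columns is easy to see in a small experiment. This is a rough sketch, again assuming Python's sqlite3 module. The note column and its values are hypothetical and added only to show the effect: applying MIN() to two columns independently can produce a combination that never existed as a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# note is a hypothetical second data column, used only for this illustration.
conn.execute('CREATE TABLE tableX ("key" TEXT, value TEXT, note TEXT)')
conn.executemany(
    "INSERT INTO tableX VALUES (?, ?, ?)",
    [
        ("one", "test", "a"),
        ("one", "another", "b"),
        ("two", "goes", "c"),
    ],
)

def min_per_column(conn):
    """Aggregate each column separately; the values may come from different rows."""
    query = '''
        SELECT "key", MIN(value) AS value, MIN(note) AS note
        FROM tableX
        GROUP BY "key"
        ORDER BY "key"
    '''
    return conn.execute(query).fetchall()

# All rows that really exist in the table, for comparison.
actual_rows = set(conn.execute('SELECT "key", value, note FROM tableX'))

for row in min_per_column(conn):
    print(row, "exists in table:", row in actual_rows)
# ('one', 'another', 'a') exists in table: False
# ('two', 'goes', 'c') exists in table: True
```

For key one, the result pairs 'another' with 'a', although those values belong to two different rows.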

PostgreSQL has a special non-standard DISTINCT ON clause that can also be used. The optional ORDER BY determines which row from each group is returned:

SELECT DISTINCT ON (key) key, value
FROM tableX
-- ORDER BY key, <some_other_expressions>
;


