How to order and partition by multiple columns?

10.6K    Asked by DavidEDWARDS in SQL Server , Asked on Mar 20, 2023

I am using PostgreSQL.

I have a table with 2 columns in this example:

I want to add a new column with a unique id corresponding to partitions by name and category as shown in the result. Then, I want to take a random sample choosing 2 (or more) unique ids because under each unique id, there will be a lot of other historical data.


I have tried this so far, but I get 1s for everything. I'm missing something really simple here, but how do I correct my mistake? I have to do this operation for several million rows in the real table.


SELECT
  dense_rank() over (partition by name, category order by category) as unique_id,
  *
FROM
 example_table
After this, presumably I'll have to use RAND() somewhere but how do I do this?

This is my naive approach to get the solution in the above pic.

with ranks (
  ...
)
select * from ranks where unique_id = 2 or unique_id = 4
Answered by David EDWARDS

Use ROW_NUMBER() function better for order and partition by multiple columns.


SELECT
    name,
    category,
    ROW_NUMBER() OVER (PARTITION BY name, category ORDER BY name, category) unique_id
FROM
    test;
name | category | unique_id
:--- | :------- | --------:
A | Alpha | 1
A | Alpha | 2
A | Alpha | 3
A | Beta | 1
A | Beta | 2
B | Alpha | 1
B | Alpha | 2
B | Theta | 1
db<>fiddle here
To get a unique_id for each pair of name-category, remove the PARTITION part and use only ORDER BY:
SELECT
    name,
    category,
    DENSE_RANK() OVER (ORDER BY name, category) unique_id
FROM
    test;
name | category | unique_id
:--- | :------- | --------:
A | Alpha | 1
A | Alpha | 1
A | Alpha | 1
A | Beta | 2
A | Beta | 2
B | Alpha | 3
B | Alpha | 3
B | Theta | 4
db<>fiddle here
To select two rows of each unique_id: (As far as I've not used any order it is supposed to be random)
WITH ct AS
(
  SELECT
      name,
      category,
      DENSE_RANK() OVER (ORDER BY name, category) unique_id
  FROM
      test
)
SELECT
    a.unique_id, c.name, c.category
FROM
    (SELECT DISTINCT unique_id
     FROM ct) a
JOIN LATERAL (SELECT b.name, b.category
              FROM ct b
              WHERE b.unique_id = a.unique_id
              LIMIT 2) c ON TRUE
ORDER BY
    a.unique_id;
unique_id | name | category
--------: | :--- | :-------
        1 | A | Alpha
        1 | A | Alpha
        2 | A | Beta
        2 | A | Beta
        3 | B | Alpha
        3 | B | Alpha
        4 | B | Theta


Your Answer

Answer (1)

To order and partition by multiple columns, you can use the ORDER BY and PARTITION BY clauses in your SQL query. Here's how you can do it:

SELECT column1, column2, column3
FROM your_table
ORDER BY column1, column2
PARTITION BY column3;

In this example, column1 and column2 are used for sorting the rows, and column3 is used for partitioning the result set. This means that within each partition defined by the values of column3, the rows will be ordered based on the values of column1 and column2.









4 Hours

Interviews

Parent Categories