mardi 23 décembre 2014

T-SQL brain teaser: find non-overlapping sets


I would like to ask for help coming up with a query, which can identify groups on non-overlapping records. Here is a sample scenario (admittedly contrived). Let's say I have employees who are assigned to work on various projects. While an employee can be assigned to multiple projects, he/she can only work on one project at any given time (don't you wish we all had this luxury :). I need to find out which projects can be scheduled to be worked on in parallel because they do not share any employees. Here is some code to setup sample tables and data.



--1. Create #Projects table
IF OBJECT_ID('tempdb..#Projects') IS NOT NULL
DROP TABLE #Projects

CREATE TABLE #Projects (
ProjectId INT,
ProjectName VARCHAR(16)
)

INSERT INTO #Projects ( ProjectId, ProjectName )
VALUES ( 1, 'Project 1'), ( 2, 'Project 2'), ( 3, 'Project 3'), ( 4, 'Project 4'), ( 5, 'Project 5'),
( 6, 'Project 6'), ( 7, 'Project 7')

--2. Create #Employees table
IF OBJECT_ID('tempdb..#Employees') IS NOT NULL
DROP TABLE #Employees

CREATE TABLE #Employees (
EmployeeId INT,
EmployeeName VARCHAR(16)
)

INSERT INTO #Employees (EmployeeId, EmployeeName)
VALUES (101, 'Employee 101'), (102, 'Employee 102'), (103, 'Employee 103'), (104, 'Employee 104'),
(105, 'Employee 105'), (106, 'Employee 106'), (107, 'Employee 107')

--3. Create #Employee_Projects table
IF OBJECT_ID('tempdb..#Employee_Projects') IS NOT NULL
DROP TABLE #Employee_Projects

CREATE TABLE #Employee_Projects (
ProjectId INT,
EmployeeId INT
)

INSERT INTO #Employee_Projects (ProjectId, EmployeeId)
VALUES (1, 101), (1, 105), (1, 107), (2, 102), (2, 103), (2, 107), (3, 104), (3, 105), (3, 106), (4, 100), (4, 101), (4, 102), (5, 103), (5, 104), (6, 105), (6, 106), (7, 106), (7, 107), (8, 102), (8, 104), (8, 106)


And here is the query, which will show you the employees and projects we have created:



SELECT p.ProjectId, p.ProjectName, e.EmployeeId, e.EmployeeName
FROM #Projects p
JOIN #Employee_Projects ep ON ep.ProjectId = p.ProjectId
JOIN #Employees e ON e.EmployeeId = ep.EmployeeId
ORDER BY ep.ProjectId, e.EmployeeId


Our data looks like this:



ProjectId ProjectName EmployeeId EmployeeName
----------- ---------------- ----------- ----------------
1 Project 1 101 Employee 101
1 Project 1 105 Employee 105
1 Project 1 107 Employee 107
2 Project 2 102 Employee 102
2 Project 2 103 Employee 103
2 Project 2 107 Employee 107
3 Project 3 104 Employee 104
3 Project 3 105 Employee 105
3 Project 3 106 Employee 106
4 Project 4 101 Employee 101
4 Project 4 102 Employee 102
5 Project 5 103 Employee 103
5 Project 5 104 Employee 104
6 Project 6 105 Employee 105
6 Project 6 106 Employee 106
7 Project 7 106 Employee 106
7 Project 7 107 Employee 107
8 Project 8 102 Employee 102
8 Project 8 104 Employee 104
8 Project 8 106 Employee 106


Visually we can tell that, for example, we can schedule projects 1, 2, 3 to run at the same time because they do not share any employees. Let's call this group of projects "Group 1". After that we could schedule projects 4, 5, 6. Let's call it "Group 2". Finally, in our example we have project 7 left and this would be our "Group 3". My question is, how can I write a T-SQL query to perform such grouping of projects?


Thank you!





Aucun commentaire:

Enregistrer un commentaire