Alternative to CASE WHEN? - sql

I have a table in SQL where the results look something like:
Number | Name | Name 2
1 | John | Derek
1 | John | NULL
2 | Jane | Louise
2 | Jane | NULL
3 | Michael | Mark
3 | Michael | NULL
4 | Sara | Paul
4 | Sara | NULL
I want a way to say that if Number=1, return Name 2 in new column Name 3, so that the results would look like:
Number | Name | Name 2 | Name 3
1 | John | Derek | Derek
1 | John | NULL | Derek
2 | Jane | Louise | Louise
2 | Jane | NULL | Louise
3 | Michael | Mark | Mark
3 | Michael | NULL | Mark
4 | Sara | Paul | Paul
4 | Sara | NULL | Paul
The problem is that I can't say if Number=1, return Name 2 in Name 3, because my table has >100,000 records. I need it to do it automatically. More like "if Number is the same, return Name 2 in Name 3." I've tried to use a CASE statement but haven't been able to figure it out. Is there any way to do this?

Empirically, this seems to work:
SELECT
Number, Name, [Name 2],
MAX([Name 2]) OVER (PARTITION BY Number) [Name 3]
FROM yourTable;
The idea here, if I interpreted your requirements correctly, is that you want to report the non-NULL value of the second name as the third name value for every record with the same Number. MAX ignores NULLs, so within each partition it picks up that value.
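A minimal sketch of the same idea, run against an in-memory SQLite database from Python rather than SQL Server (that substitution, and the table name people, are assumptions made only for this example; window functions need SQLite 3.25 or later):

```python
import sqlite3

# In-memory SQLite stand-in for the SQL Server table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Number INTEGER, Name TEXT, Name2 TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "John", "Derek"), (1, "John", None),
        (2, "Jane", "Louise"), (2, "Jane", None),
    ],
)

# MAX() skips NULLs, so every row receives the non-NULL Name2 of its Number group.
rows = conn.execute(
    """
    SELECT Number, Name, Name2,
           MAX(Name2) OVER (PARTITION BY Number) AS Name3
    FROM people
    ORDER BY Number, Name2 IS NULL
    """
).fetchall()

for row in rows:
    print(row)
```

If a Number could carry several different non-NULL Name2 values, MAX would simply return the alphabetically last one, so this relies on there being one distinct value per group.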

Solution 3, with group by
with maxi as(
SELECT Number, max(Name2) name3
FROM #sample
group by Number
)
SELECT f1.*, f2.name3
FROM #sample f1 inner join maxi f2 on f1.number=f2.number

Solution 4, with cross apply
SELECT *
FROM #sample f1 cross apply
(
select top 1 f2.Name2 as Name3 from #sample f2
where f2.number=f1.number and f2.Name2 is not null
) f3

You can try this:
Solution 1, with row_number
create table #sample (Number integer, Name varchar(50), Name2 varchar(50))
insert into #sample
select 1 , 'John' , 'Derek' union all
select 1 , 'John' , NULL union all
select 2 , 'Jane' , 'Louise' union all
select 2 , 'Jane' , NULL union all
select 3 , 'Michael' , 'Mark' union all
select 3 , 'Michael' , NULL union all
select 4 , 'Sara' , 'Paul' union all
select 4 , 'Sara' , NULL ;
with tmp as (
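-- order the non-NULL Name2 first so that rang = 1 is always the row carrying the value
select *, row_number() over(partition by Number order by case when Name2 is null then 1 else 0 end) rang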
from #sample
)
select f1.Number, f1.Name, f1.Name2, f2.Name2 as Name3
from tmp f1 inner join tmp f2 on f1.Number=f2.Number and f2.rang=1

Solution 2, with lag (if your sql server version has lag function)
SELECT
Number, Name, Name2,
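-- non-NULL row is ordered first; this only fills correctly with at most one NULL row per Number
isnull(Name2, lag(Name2) OVER (PARTITION BY Number order by case when Name2 is null then 1 else 0 end)) Name3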
FROM #sample;

Related

Removing group of results if total is 0

I am using the following table to create a stacked bar chart - its quite a bit larger than this:
ID | Name | foodEaten | total
1 | Sam | Burger | 3
1 | Sam | Pizza | 1
1 | Sam | Kebab | 0
1 | Sam | Cheesecake| 3
1 | Sam | Sandwich | 5
2 | Jeff | Burger | 0
2 | Jeff | Pizza | 0
2 | Jeff | Kebab | 0
2 | Jeff | Cheesecake| 0
2 | Jeff | Sandwich | 0
I need to find a way to remove results like Jeff. Where the entire total for what he ate is 0. I can't think of the easiest way to achieve this. I've tried grouping the entire result by Id and creating a total, but its just not happening.
If the person has eaten a total of 0 food, then he needs to be excluded. But if he hasn't, and he hasn't eaten any kebabs, as shown in my above table, this needs to be included in the result!
So the output needed is:
ID | Name | foodEaten | total
1 | Sam | Burger | 3
1 | Sam | Pizza | 1
1 | Sam | Kebab | 0
1 | Sam | Cheesecake| 3
1 | Sam | Sandwich | 5
Assuming that you want the rows as they appear, rather than aggregating them first and then excluding:
WITH CTE AS (
SELECT ID,
[Name],
foodEaten,
total,
SUM(total) OVER (PARTITION BY [Name]) AS nameTotal
FROM YourTable)
SELECT ID,
[Name],
foodEaten,
total
FROM CTE
WHERE nameTotal > 0;
select id, name, sum(total) as total from <table> group by id, name having sum(total) > 0
Does this work for you?
You can try below -
select id,name
from tablename a
group by id,name
having sum(total)>0
OR
select * from tablename a
where not exists (select 1 from tablename b where a.id=b.id group by id,name
having sum(total)=0)
Try this
;WITH CTE (ID , Name , foodEaten , total)
AS
(
SELECT 1 , 'Sam' , 'Burger' , 3 UNION ALL
SELECT 1 , 'Sam' , 'Pizza' , 1 UNION ALL
SELECT 1 , 'Sam' , 'Kebab' , 0 UNION ALL
SELECT 1 , 'Sam' , 'Cheesecake', 3 UNION ALL
SELECT 1 , 'Sam' , 'Sandwich' , 5 UNION ALL
SELECT 2 , 'Jeff' , 'Burger' , 0 UNION ALL
SELECT 2 , 'Jeff' , 'Pizza' , 0 UNION ALL
SELECT 2 , 'Jeff' , 'Kebab' , 0 UNION ALL
SELECT 2 , 'Jeff' , 'Cheesecake', 0 UNION ALL
SELECT 2 , 'Jeff' , 'Sandwich' , 0
)
SELECT ID , Name ,SUM( total) AS Grandtotal
FROM CTE
GROUP BY ID , Name
HAVING SUM( total) >0
Result
ID Name Grandtotal
----------------------
1 Sam 12
Using DELETE with HAVING SUM(total) = 0 removes every group of rows whose total is 0:
DELETE FROM TableName
WHERE ID IN (SELECT Id FROM TableName GROUP BY ID HAVING SUM(total) = 0)
Or, if you want to leave the table untouched and only select the records whose group total is not zero:
SELECT * FROM TableName
WHERE ID NOT IN (SELECT Id FROM TableName GROUP BY ID HAVING SUM(total) = 0)
Assuming total is never negative, then probably the most efficient method is to use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.name = t.name and
t2.total > 0
);
In particular, this can take advantage of an index on (name, total).
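As a rough check of the EXISTS approach, here is a small sketch using Python's built-in sqlite3 module in place of SQL Server (an assumption for illustration; the index name idx_t_name_total is made up). It keeps every row of a person who has at least one positive total, including that person's zero rows:

```python
import sqlite3

# In-memory SQLite stand-in for the real table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, foodEaten TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "Sam", "Burger", 3), (1, "Sam", "Kebab", 0),
        (2, "Jeff", "Burger", 0), (2, "Jeff", "Pizza", 0),
    ],
)
# The (name, total) index mentioned in the answer; the name is hypothetical.
conn.execute("CREATE INDEX idx_t_name_total ON t (name, total)")

# A row survives if any row with the same name has a positive total.
kept = conn.execute(
    """
    SELECT t.id, t.name, t.foodEaten, t.total
    FROM t
    WHERE EXISTS (SELECT 1
                  FROM t AS t2
                  WHERE t2.name = t.name AND t2.total > 0)
    ORDER BY t.id, t.foodEaten
    """
).fetchall()

for row in kept:
    print(row)
```

Sam's Kebab row (total 0) stays in the result, while both of Jeff's rows are dropped.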

SQL: Pick highest and lowest value (int) from one row

I am looking for a way to pick the highest and lowest value (integer) from a single row in a table. There are 4 columns that I need to compare with each other to get the highest and lowest number among them.
The table looks something like this...
id | name | col_to_compare1 | col_to_compare2 | col_to_compare3 | col_to_compare4
1 | John | 5 | 5 | 2 | 1
2 | Peter | 3 | 2 | 4 | 1
3 | Josh | 3 | 5 | 1 | 3
Can you help me, please? Thanks!
You can do this using CROSS APPLY and the VALUES clause. Use VALUES to group all your compared columns and then select the max.
SELECT
MAX(d.data1) as MaxOfColumns
,MIN(d.data1) as MinOfColumns
,a.id
,a.name
FROM YOURTABLE as a
CROSS APPLY (
VALUES(a.col_to_compare1)
,(a.col_to_compare2)
,(a.col_to_compare3)
,(a.col_to_compare4)
) as d(data1) --Name the Column
GROUP BY a.id
,a.name
Assuming you are looking for min/max per row
Declare @YourTable table (id int,name varchar(50),col_to_compare1 int,col_to_compare2 int,col_to_compare3 int,col_to_compare4 int)
Insert Into @YourTable values
(1,'John',5,5,2,1),
(2,'Peter',3,2,4,1),
(3,'Josh',3,5,1,3)
Select A.ID
,A.Name
,MinVal = min(B.N)
,MaxVal = max(B.N)
From @YourTable A
Cross Apply (Select N From (values(a.col_to_compare1),(a.col_to_compare2),(a.col_to_compare3),(a.col_to_compare4)) N(N) ) B
Group By A.ID,A.Name
Returns
ID Name MinVal MaxVal
1 John 1 5
3 Josh 1 5
2 Peter 1 4
These solutions keep the current rows and add additional columns of min/max.
select *
from t cross apply
(select min(col) as min_col
,max(col) as max_col
from (
values
(t.col_to_compare1)
,(t.col_to_compare2)
,(t.col_to_compare3)
,(t.col_to_compare4)
) c(col)
) c
OR
select *
,cast ('' as xml).value ('min ((sql:column("t.col_to_compare1"),sql:column("t.col_to_compare2"),sql:column("t.col_to_compare3"),sql:column("t.col_to_compare4")))','int') as min_col
,cast ('' as xml).value ('max ((sql:column("t.col_to_compare1"),sql:column("t.col_to_compare2"),sql:column("t.col_to_compare3"),sql:column("t.col_to_compare4")))','int') as max_col
from t
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
| id | name | col_to_compare1 | col_to_compare2 | col_to_compare3 | col_to_compare4 | min_col | max_col |
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
| 1 | John | 5 | 5 | 2 | 1 | 1 | 5 |
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
| 2 | Peter | 3 | 2 | 4 | 1 | 1 | 4 |
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
| 3 | Josh | 3 | 5 | 1 | 3 | 1 | 5 |
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
A way to do this is to "break" apart the data
declare @table table (id int, name varchar(10), col1 int, col2 int, col3 int, col4 int)
insert into @table values (1 , 'John' , 5 , 5 , 2 , 1)
insert into @table values (2 , 'Peter' , 3 , 2 , 4 , 1)
insert into @table values (3 , 'Josh' , 3 , 5 , 1 , 3)
;with stretch as
(
select id, col1 as col from @table
union all
select id, col2 as col from @table
union all
select id, col3 as col from @table
union all
select id, col4 as col from @table
)
select
t.id,
t.name,
agg.MinCol,
agg.MaxCol
from @table t
inner join
(
select
id, min(col) as MinCol, max(col) as MaxCol
from stretch
group by id
) agg
on t.id = agg.id
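The "break apart" approach is portable, so it can be tried outside SQL Server. Below is a sketch using Python's sqlite3 module (an assumption for illustration; the table name scores and the shortened column names are made up) that stacks the four columns with UNION ALL and aggregates per id:

```python
import sqlite3

# In-memory SQLite stand-in for the real table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (id INTEGER, name TEXT, col1 INTEGER, col2 INTEGER, "
    "col3 INTEGER, col4 INTEGER)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "John", 5, 5, 2, 1), (2, "Peter", 3, 2, 4, 1), (3, "Josh", 3, 5, 1, 3)],
)

# Turn the four columns into rows, then take MIN/MAX per original row.
min_max = conn.execute(
    """
    WITH stretch AS (
        SELECT id, col1 AS col FROM scores
        UNION ALL SELECT id, col2 FROM scores
        UNION ALL SELECT id, col3 FROM scores
        UNION ALL SELECT id, col4 FROM scores
    )
    SELECT s.id, s.name, MIN(x.col) AS MinCol, MAX(x.col) AS MaxCol
    FROM scores AS s
    JOIN stretch AS x ON x.id = s.id
    GROUP BY s.id, s.name
    ORDER BY s.id
    """
).fetchall()

for row in min_max:
    print(row)
```

This reads the table once per compared column, which is why the CROSS APPLY with VALUES variants are usually preferred on SQL Server when the table is large.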
Seems simple enough
SELECT min(col1), max(col1), min(col2), max(col2), min(col3), max(col3), min(col4), max(col4) FROM table
Gives you the Min and Max for each column.
Following OP's comment, I believe he may be looking for a min/max grouped by the person being queried against.
So that would be:
SELECT name, min(col1), max(col1), min(col2), max(col2), min(col3), max(col3), min(col4), max(col4) FROM table GROUP BY name

Self join next timestamp

I am looking to merge timestamps from 2 different rows based on employee and punch card, but MAX or a limit does not work in the FROM clause, and if I only use > then I get every subsequent row for every day. I want the next higher value on a self join. I also have to mention that I have to use SQL Server 2008, so LAG and LEAD do not work!
Please help me.
SELECT Det.name
,Det.[time]
,Det2.[time]
,Det.[type]
,det2.type
,Det.[detail]
FROM [detail] Det
join [detail] Det2 on
Det2.name = Det.name
and
Det2.time > Det.time Max 1
where det.type <>3
Table detail
NAME | Time | Type | detail
john | 10:30| 1 | On
steve| 10:32| 1 | On
john | 10:34| 2 | break
paul | 10:35| 1 | On
steve| 10:45| 3 | Off
john | 10:49| 2 | on
paul | 10:55| 3 | Off
john | 11:12| 3 | Off
Wanted result
John | 10:30 | 10:34 | 1 | 2 | On
John | 10:34 | 10:49 | 2 | 1 | Break
John | 10:49 | 11:12 | 1 | 3 | on
Steve| 10:32 | 10:45 | 1 | 3 | on
Paul | 10:35 | 10:55 | 1 | 3 | On
Thank you in advance!
You can do it with cross apply:
SELECT Det.name
,Det.[time]
,ca.[time]
,Det.[type]
,ca.type
,Det.[detail]
FROM [detail] Det
Cross Apply(Select Top 1 * From detail det2 where det.Name = det2.Name And det2.[time] > det.[time] Order By det2.[time]) ca
Where det.Type <> 3
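The "next row for the same name" lookup can also be written as a plain correlated subquery, which needs neither CROSS APPLY nor LAG/LEAD. The sketch below uses Python's sqlite3 module instead of SQL Server 2008, and the names punches, punch_time and punch_type are hypothetical stand-ins for the question's table and columns:

```python
import sqlite3

# In-memory SQLite stand-in for the detail table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE punches (name TEXT, punch_time TEXT, punch_type INTEGER, detail TEXT)"
)
conn.executemany(
    "INSERT INTO punches VALUES (?, ?, ?, ?)",
    [
        ("john", "10:30", 1, "On"), ("steve", "10:32", 1, "On"),
        ("john", "10:34", 2, "break"), ("paul", "10:35", 1, "On"),
        ("steve", "10:45", 3, "Off"), ("john", "10:49", 2, "on"),
        ("paul", "10:55", 3, "Off"), ("john", "11:12", 3, "Off"),
    ],
)

# Pair each row with the row holding the smallest later time for the same name.
pairs = conn.execute(
    """
    SELECT d.name, d.punch_time, n.punch_time, d.punch_type, n.punch_type, d.detail
    FROM punches AS d
    JOIN punches AS n
      ON n.name = d.name
     AND n.punch_time = (SELECT MIN(x.punch_time)
                         FROM punches AS x
                         WHERE x.name = d.name
                           AND x.punch_time > d.punch_time)
    WHERE d.punch_type <> 3
    ORDER BY d.name, d.punch_time
    """
).fetchall()

for row in pairs:
    print(row)
```

The HH:MM strings compare correctly as text only because they are zero-padded and fall on a single day; real data would use a proper datetime column.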
As you said, the LAG and LEAD functions won't work for you, but you could apply ROW_NUMBER() OVER (PARTITION BY name ORDER BY time) to two copies of the table and then JOIN each row to the row of the same name whose row number is one higher.
This is just an idea, but I don't see why it shouldn't work.
Query:
;WITH Data (NAME, TIME, type, detail)
AS (
SELECT 'john', CAST('10:30' AS DATETIME2), 1, 'On'
UNION ALL
SELECT 'steve', '10:32', 1, 'On'
UNION ALL
SELECT 'john', '10:34', 2, 'break'
UNION ALL
SELECT 'paul', '10:35', 1, 'On'
UNION ALL
SELECT 'steve', '10:45', 3, 'Off'
UNION ALL
SELECT 'john', '10:49', 2, 'on'
UNION ALL
SELECT 'paul', '10:55', 3, 'Off'
UNION ALL
SELECT 'john', '11:12', 3, 'Off'
)
SELECT t.NAME, LTRIM(RIGHT(CONVERT(VARCHAR(25), t.TIME, 100), 7)) AS time, LTRIM(RIGHT(CONVERT(VARCHAR(25), t2.TIME, 100), 7)) AS time, t.type, t2.type, t.detail
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY TIME) rn, *
FROM Data
) AS t
INNER JOIN (
SELECT ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY TIME) rn, *
FROM Data
) AS t2
ON t2.NAME = t.NAME
AND t2.rn = t.rn + 1;
Result:
NAME time time type type detail
----------------------------------------------
john 10:30AM 10:34AM 1 2 On
john 10:34AM 10:49AM 2 2 break
john 10:49AM 11:12AM 2 3 on
paul 10:35AM 10:55AM 1 3 On
steve 10:32AM 10:45AM 1 3 On
Any comments, concerns - let me know. :)
As @evaldas-buinauskas said,
the OVER clause will work for you. LAG and LEAD do the same job more directly, but they require SQL Server 2012 or later.
Here is a similar example:
http://www.databasejournal.com/features/mssql/lead-and-lag-functions-in-sql-server-2012.html

Data Matching with SQL and assigning Identity ID's

How do I write a query that will match data and produce an identity for it?
For Example:
RecordID | Name
1 | John
2 | John
3 | Smith
4 | Smith
5 | Smith
6 | Carl
I want a query which will assign an identity after matching exactly on Name.
Expected Output:
RecordID | Name | ID
1 | John | 1X
2 | John | 1X
3 | Smith | 1Y
4 | Smith | 1Y
5 | Smith | 1Y
6 | Carl | 1Z
Note: The ID should be unique for every match. Also, it can be numbers or varchar.
Can somebody help me with this? The main thing is to assign the IDs.
Thanks.
How about this:
with temp as
(
select 1 as id,'John' as name
union
select 2,'John'
union
select 3,'Smith'
union
select 4,'Smith'
union
select 5,'Smith'
union
select 6,'Carl'
)
SELECT *, DENSE_RANK() OVER
(ORDER BY Name) as NewId
FROM TEMP
Order by id
The first part is for testing purposes only.
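To see what DENSE_RANK produces on the question's data, here is a small sketch run through Python's sqlite3 module rather than SQL Server (an assumption for illustration; it needs SQLite 3.25 or later, and the names records and MatchID are made up):

```python
import sqlite3

# In-memory SQLite stand-in for the real table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (RecordID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1, "John"), (2, "John"), (3, "Smith"), (4, "Smith"), (5, "Smith"), (6, "Carl")],
)

# Equal names share a rank, and DENSE_RANK leaves no gaps between ranks.
ranked = conn.execute(
    """
    SELECT RecordID, Name,
           DENSE_RANK() OVER (ORDER BY Name) AS MatchID
    FROM records
    ORDER BY RecordID
    """
).fetchall()

for row in ranked:
    print(row)
```

The IDs follow the alphabetical order of the names (Carl, John, Smith), not the order in which the names first appear in the table.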
Please try:
SELECT *,
Rank() over (order by Name ASC)
FROM table
This structure seems to work:
CREATE TABLE #Table
(
Department VARCHAR(100),
Name VARCHAR(100)
);
INSERT INTO #Table VALUES
('Sales','michaeljackson'),
('Sales','michaeljackson'),
('Sales','jim'),
('Sales','jim'),
('Sales','jill'),
('Sales','jill'),
('Sales','jill'),
('Sales','j');
WITH Cte_Rank AS
(
SELECT [Name],
rw = ROW_NUMBER() OVER (ORDER BY [Name])
FROM #Table
GROUP BY [Name]
)
SELECT a.Department,
a.Name,
b.rw
FROM #Table a
INNER JOIN Cte_Rank b
ON a.Name = b.Name;

Query for missing elements

I have a table with the following structure:
timestamp | name | value
0 | john | 5
1 | NULL | 3
8 | NULL | 12
12 | john | 3
33 | NULL | 4
54 | pete | 1
180 | NULL | 4
400 | john | 3
401 | NULL | 4
592 | anna | 2
Now what I am looking for is a query that will give me the sum of the values for each name, and treat the nulls in between (ordered by the timestamp) as the first non-null name down the list, as if the table were as follows:
timestamp | name | value
0 | john | 5
1 | john | 3
8 | john | 12
12 | john | 3
33 | pete | 4
54 | pete | 1
180 | john | 4
400 | john | 3
401 | anna | 4
592 | anna | 2
and I would query SUM(value), name from this table group by name. I have thought and tried, but I can't come up with a proper solution. I have looked at recursive common table expressions, and think the answer may lie in there, but I haven't been able to properly understand those.
These tables are just examples, and I don't know the timestamp values in advance.
Could someone give me a hand? Help would be very much appreciated.
With Inputs As
(
Select 0 As [timestamp], 'john' As Name, 5 As value
Union All Select 1, NULL, 3
Union All Select 8, NULL, 12
Union All Select 12, 'john', 3
Union All Select 33, NULL, 4
Union All Select 54, 'pete', 1
Union All Select 180, NULL, 4
Union All Select 400, 'john', 3
Union All Select 401, NULL, 4
Union All Select 592, 'anna', 2
)
, NamedInputs As
(
Select I.timestamp
, Coalesce (I.Name
, (
Select I3.Name
From Inputs As I3
Where I3.timestamp = (
Select Min(I2.timestamp)
From Inputs As I2
Where I2.timestamp > I.timestamp
And I2.Name Is not Null
)
)) As name
, I.value
From Inputs As I
)
Select NI.name, Sum(NI.Value) As Total
From NamedInputs As NI
Group By NI.name
Btw, what would be orders of magnitude faster than any query would be to first correct the data. I.e., update the name column to have the proper value, make it non-nullable and then run a simple Group By to get your totals.
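A sketch of the "correct the data first" route, following the question's expected table, where each NULL takes the next non-NULL name further down the list. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption for illustration; the table name inputs mirrors the CTE name used above):

```python
import sqlite3

# In-memory SQLite stand-in for the real table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inputs (timestamp INTEGER, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO inputs VALUES (?, ?, ?)",
    [
        (0, "john", 5), (1, None, 3), (8, None, 12), (12, "john", 3),
        (33, None, 4), (54, "pete", 1), (180, None, 4), (400, "john", 3),
        (401, None, 4), (592, "anna", 2),
    ],
)

# One-off repair: give every NULL row the name of the next named row.
conn.execute(
    """
    UPDATE inputs
    SET name = (SELECT i.name
                FROM inputs AS i
                WHERE i.timestamp > inputs.timestamp
                  AND i.name IS NOT NULL
                ORDER BY i.timestamp
                LIMIT 1)
    WHERE name IS NULL
    """
)

# After the repair, the totals are a plain GROUP BY.
totals = conn.execute(
    "SELECT name, SUM(value) FROM inputs GROUP BY name ORDER BY name"
).fetchall()

for row in totals:
    print(row)
```

On SQL Server the same repair would use TOP (1) instead of LIMIT 1; once the column is filled in, it can be made non-nullable as suggested.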
Additional Solution
Select Coalesce(I.Name, I2.Name), Sum(I.value) As Total
From Inputs As I
Left Join (
Select I1.timestamp, MIN(I2.Timestamp) As NextNameTimestamp
From Inputs As I1
Left Join Inputs As I2
On I2.timestamp > I1.timestamp
And I2.Name Is Not Null
Group By I1.timestamp
) As Z
On Z.timestamp = I.timestamp
Left Join Inputs As I2
On I2.timestamp = Z.NextNameTimestamp
Group By Coalesce(I.Name, I2.Name)
You don't need a CTE, just a simple subquery that looks up the next non-null name.
select t.timestamp, ISNULL(t.name, (
select top(1) i.name
from inputs i
where i.timestamp > t.timestamp
and i.name is not null
order by i.timestamp
)), t.value
from inputs t
And summing from here
select name, SUM(value) as totalValue
from
(
select t.timestamp, ISNULL(t.name, (
select top(1) i.name
from inputs i
where i.timestamp > t.timestamp
and i.name is not null
order by i.timestamp
)) as name, t.value
from inputs t
) N
group by name
I hope I'm not going to be embarrassed by offering you this little recursive CTE query of mine as a solution to your problem.
;WITH
numbered_table AS (
SELECT
timestamp, name, value,
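-- number from the last row backwards so each NULL inherits the next non-null name
rownum = ROW_NUMBER() OVER (ORDER BY timestamp DESC)
FROM your_table
),
filled_table AS (
SELECT
timestamp,
name,
value,
rownum
FROM numbered_table
WHERE rownum = 1
UNION ALL
SELECT
nt.timestamp,
name = ISNULL(nt.name, ft.name),
nt.value,
nt.rownum
FROM numbered_table nt
INNER JOIN filled_table ft ON nt.rownum = ft.rownum + 1
)
SELECT *
FROM filled_table
OPTION (MAXRECURSION 0) -- the default recursion limit is 100 levels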
/* or go ahead aggregating instead */