Inner Join + select the most recent - sql

I have been trying to write the query below, but honestly it's driving me crazy.
I have 2 Tables on MS SQL CE 4.0
Table 1 Name: Items
ID
Item_Code
Logged_by
Description
ID | Item_Code | Logged_by | Description
1 | A | Pete | just an A
2 | B | Mary | Seams like a B
3 | C | Joe | Obviously this is a C
4 | D | Pete | This is another A
Table 2 Name: Item_Comments
ID
Item_Code
Comment
Date
ID | Item_Code | Comment | Date
1 | B | Done | 2014/08/08
2 | A | Nice A | 2014/08/08
3 | B | Send 1 More | 2014/08/09
4 | C | Done | 2014/08/10
5 | D | This is an A | 2014/08/10
6 | D | Opps Sorry | 2014/08/11
The wanted result: I'm looking to join the most recent comment from Item_Comments to the Items Table
ID | Item_Code | Logged_by | Description | Comment
1 | A | Pete | just an A | Nice A
2 | B | Mary | Seams like a B | Send 1 More
3 | C | Joe | Obviously this is a C | Done
4 | D | Pete | This is another A | Opps Sorry
I wrote this query, but the results come back all mixed up:
SELECT *
FROM Items t1
JOIN
(SELECT Item_Code, Comment, MAX(date) as MyDate
FROM Item_Comments
Group By Item_Code, Comment, Date
) t2
ON Item_Code= Item_Code
ORDER BY t1.Item_Code;
Do you know of any way to do this?

Try:
select x.*, z.comment
from items x
join (select item_code, max(date) as latest_dt
from item_comments
group by item_code) y
on x.item_code = y.item_code
join item_comments z
on y.item_code = z.item_code
and y.latest_dt = z.date
Fiddle test: http://sqlfiddle.com/#!6/d387f/8/0
You were close with your query, but in the inline view aliased as t2 you grouped by comment (and date) as well, so every row became its own group and MAX had nothing to aggregate. In t2 you should select only item_code and max(date) and group only by item_code; you can then join that result back to item_comments (y and z in my query above).
This is a second way of doing this using a subquery, however I would stick to the above (a join w/ an inline view):
select i.*, c.comment
from items i
join item_comments c
on i.item_code = c.item_code
where c.date = (select max(x.date)
from item_comments x
where x.item_code = c.item_code)
order by i.id
Fiddle test: http://sqlfiddle.com/#!6/d387f/11/0
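To try the join-with-inline-view query without a SQL Server instance, here is a minimal sketch that runs it against the question's sample data. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in engine and stores the dates as ISO text, so it illustrates the technique rather than SQL CE behaviour; the function name latest_comments is hypothetical.

```python
import sqlite3

def latest_comments(conn):
    """Return one row per item together with its most recent comment."""
    sql = """
        SELECT i.id, i.item_code, i.logged_by, i.description, c.comment
        FROM items AS i
        JOIN (SELECT item_code, MAX(date) AS latest_dt   -- newest date per item
              FROM item_comments
              GROUP BY item_code) AS y
          ON y.item_code = i.item_code
        JOIN item_comments AS c                          -- look up the comment on that date
          ON c.item_code = y.item_code
         AND c.date = y.latest_dt
        ORDER BY i.id
    """
    return conn.execute(sql).fetchall()

# Assumption: SQLite in memory, dates stored as ISO-8601 text so MAX() sorts them correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER, item_code TEXT, logged_by TEXT, description TEXT);
    CREATE TABLE item_comments (id INTEGER, item_code TEXT, comment TEXT, date TEXT);
    INSERT INTO items VALUES
        (1, 'A', 'Pete', 'just an A'),
        (2, 'B', 'Mary', 'Seams like a B'),
        (3, 'C', 'Joe',  'Obviously this is a C'),
        (4, 'D', 'Pete', 'This is another A');
    INSERT INTO item_comments VALUES
        (1, 'B', 'Done',         '2014-08-08'),
        (2, 'A', 'Nice A',       '2014-08-08'),
        (3, 'B', 'Send 1 More',  '2014-08-09'),
        (4, 'C', 'Done',         '2014-08-10'),
        (5, 'D', 'This is an A', '2014-08-10'),
        (6, 'D', 'Opps Sorry',   '2014-08-11');
""")

rows = latest_comments(conn)
for row in rows:
    print(row)
```

If two comments for the same item share the latest date, this join returns both of them, so a tie-breaker (for example the comment ID) would be needed in that case.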

Note that if you run the inner query on its own, you get every single record:
SELECT Item_Code, Comment, MAX(date) as MyDate
FROM Item_Comments
Group By Item_Code, Comment, Date
You want only the most recent comment. Assuming this is SQL Server 2008 or earlier, this gets you the most recent date for each Item_Code:
SELECT Item_Code, MAX(date) as MyDate
FROM Item_Comments
Group By Item_Code
Now you need to join that back and look up the comment on that date:
SELECT C.*
FROM Item_Comments C
INNER JOIN
(SELECT Item_Code, MAX(date) as MyDate
FROM Item_Comments
Group By Item_Code
) t2
ON C.Item_Code= t2.Item_Code
AND C.date = t2.MyDate
Now you can use that to join back to your original table:
SELECT t1.*, LatestComment.*
FROM Items t1
INNER JOIN
(
SELECT C.*
FROM Item_Comments C
INNER JOIN
(SELECT Item_Code, MAX(date) as MyDate
FROM Item_Comments
Group By Item_Code
) t2
ON C.Item_Code= t2.Item_Code
AND C.date = t2.MyDate
) LatestComment
On LatestComment.Item_Code = t1.Item_Code
Depending on the actual database you are using, this can get much simpler. That's why you need to tag your question with your database and version.

Try this,
create table items (id int, item_code char(1), logged_by varchar(10), description varchar(30));
insert into items values (1, 'A', 'Pete', 'just an A');
insert into items values (2, 'B', 'Mary', 'Seams like a B');
insert into items values (3, 'C', 'Joe', 'Obviously this is a C');
insert into items values (4, 'D', 'Pete', 'This is another A');
create table item_comments (id int, item_code char(1), comment varchar(20), date date);
insert into item_comments values (1, 'B', 'Done', '2014/08/08');
insert into item_comments values (2, 'A', 'Nice A', '2014/08/08');
insert into item_comments values (3, 'B', 'Send 1 More', '2014/08/09');
insert into item_comments values (4, 'C', 'Done', '2014/08/10');
insert into item_comments values (5, 'D', 'This is an A', '2014/08/10');
insert into item_comments values (6, 'D', 'Opps Sorry', '2014/08/11');
select * from items;
select * from item_comments;
-- number each item's comments from newest to oldest, then keep only the newest
select x.id, x.item_code, x.logged_by, x.description, x.comment
from (select i.id, i.item_code, i.logged_by, i.description, ic.comment
,row_number() over(partition by i.item_code order by ic.date desc) as Rnk
from items i inner join item_comments ic
on i.item_code=ic.item_code) x
where x.Rnk=1
order by x.item_code

Related

PGSQL - Combining many AND + OR in WHERE clause

I have this table format:
| productid | price |
| ----------| -------------- |
| 1 | 10 |
| 2 | 20 |
| 3 | 30 |
| 4 | 40 |
Let's say I want to select all rows where:
(productid is 1 and price is over 50) OR
(productid is 2 and price is over 100) OR
(productid is 3 and price is over 20)
Is there a better, more generic way to achieve this (something like arrays with indexes) other than writing one condition at a time, like:
select * from table where (productid = 1 and price > 50) OR
(productid = 2 and price > 100) OR
(productid = 3 and price > 20)
I would use a values clause:
select *
from the_table t
join (
values (1, 50),
(2, 100),
(3, 20)
) as p(productid, price)
on t.productid = p.productid
and t.price > p.price;
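The same idea can be driven from application code, with the list of (productid, minimum price) pairs supplied as bound parameters. This is a minimal sketch assuming SQLite via Python's sqlite3 module; because SQLite does not accept the `as p(productid, price)` column-alias form, the VALUES list is wrapped in a CTE instead. The sample rows and the name min_price are made up for illustration.

```python
import sqlite3

# Assumption: illustrative rows, not the table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE the_table (productid INTEGER, price INTEGER);
    INSERT INTO the_table VALUES
        (1, 10), (1, 60), (2, 20), (2, 150), (3, 30), (4, 40);
""")

# One (productid, minimum price) pair per condition.
conditions = [(1, 50), (2, 100), (3, 20)]

# Build "(?, ?), (?, ?), ..." so the values themselves stay bound parameters.
placeholders = ", ".join("(?, ?)" for _ in conditions)
sql = f"""
    WITH p(productid, min_price) AS (VALUES {placeholders})
    SELECT t.productid, t.price
    FROM the_table AS t
    JOIN p
      ON p.productid = t.productid
     AND t.price > p.min_price
    ORDER BY t.productid, t.price
"""
params = [value for pair in conditions for value in pair]

matches = conn.execute(sql, params).fetchall()
print(matches)
```

Adding or removing a condition then only means changing the conditions list; the query text is rebuilt with the matching number of placeholders.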
with conditions as (
select a[1]::int as product_id, a[2]::int as min_price
from (select regexp_split_to_array(unnest(string_to_array('1,50;2,100;3,20', ';')),',')) as dt(a)
)
select t.* from my_table t
inner join conditions c on t.product = c.product_id
and t.price >= c.min_price
test data:
drop table if exists my_table;
create temp table if not exists my_table(product int, price int, note text);
insert into my_table
select 1,10, 'some info 1:10' union all
select 1,20, 'some info 1:20' union all
select 1,50, 'some info 1:50' union all
select 2,20, 'some info 2:10' union all
select 2,100, 'some info 2:100:1' union all
select 2,100, 'some info 2:100:2' union all
select 3,30, 'some info 3:30' union all
select 4,40, 'some info 4:40';
result (because the join uses >=, prices equal to the minimum are included):
1 50 "some info 1:50"
2 100 "some info 2:100:1"
2 100 "some info 2:100:2"
3 30 "some info 3:30"
unnest(string_to_array('1,50;2,100;3,20', ';'))-- split CSV to rows
regexp_split_to_array(...., ',') -- split CSV to columns

Concatenate from rows in SQL server

I want to concatenate from multiple rows
Table:
|id |Attribute |Value |
|--------|------------|---------|
|101 |Manager |Rudolf |
|101 |Account |456 |
|101 |Code |B |
|102 |Manager |Anna |
|102 |Cardno |123 |
|102 |Code |B |
|102 |Code |C |
The result I’m looking for is:
|id |Manager|Account|Cardno|Code |
|--------|-------|-------|------|----------|
|101 |Rudolf |456 | |B |
|102 |Anna | |123 |B,C |
I have the following code from a related question:
select
p.*,
a.value as Manager,
b.value as Account,
c.value as Cardno
from table1 p
left join table2 a on a.id = p.id and a.attribute = 'Manager'
left join table2 b on b.id = p.id and b.attribute = 'Account'
left join table2 c on c.id = p.id and c.attribute = 'Cardno'
However, it fails for the Code attribute with ID# 102, where both B and C values are present.
How can I update this to include both of those values in the same result?
If you are using SQL Server 2017 or above, string_agg() with PIVOT is the easier and faster solution (Query#1).
If you are using an older version of SQL Server, go for Query#2, which uses STUFF() and FOR XML PATH to concatenate the values, along with PIVOT.
Schema:
create table table1 (id int, Attribute varchar(50) , Value varchar(50));
insert into table1 values(101 ,'Manager' ,'Rudolf');
insert into table1 values(101 ,'Account' ,'456');
insert into table1 values(101 ,'Code' ,'B');
insert into table1 values(102 ,'Manager' ,'Anna');
insert into table1 values(102 ,'Cardno' ,'123');
insert into table1 values(102 ,'Code' ,'B');
insert into table1 values(102 ,'Code' ,'C');
GO
Query#1 PIVOT() with STRING_AGG():
select *
from
(
select t1.id,t1.attribute,
string_agg(value,',') AS value
from table1 t1
group by t1.id,t1.attribute
) d
pivot
(
max(value)
for attribute in (manager,account,cardno,code)
) piv
Output:
|id |manager|account|cardno|code|
|---|-------|-------|------|----|
|101|Rudolf |456    |null  |B   |
|102|Anna   |null   |123   |B,C |
Query#2 PIVOT() with STUFF() and FOR XML PATH:
select *
from
(
select distinct t1.id,t1.attribute,
STUFF(
(SELECT ', ' + convert(varchar(10), t2.value, 120)
FROM table1 t2
where t1.id = t2.id and t1.attribute=t2.attribute
FOR XML PATH (''))
, 1, 2, '') AS value -- strip the leading ', ' (two characters)
from table1 t1
) d
pivot
(
max(value)
for attribute in (manager,account,cardno,code)
) piv
Output:
|id |manager|account|cardno|code|
|---|-------|-------|------|----|
|101|Rudolf |456    |null  |B   |
|102|Anna   |null   |123   |B, C|
Another method via XML and XQuery.
It is for SQL Server 2008 onwards.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT, attribute VARCHAR(20), [Value] VARCHAR(30));
INSERT INTO @tbl (ID, attribute, Value) VALUES
(101,'Manager','Rudolf'),
(101,'Account','456'),
(101,'Code','B'),
(102,'Manager','Anna'),
(102,'Cardno','123'),
(102,'Code','B'),
(102,'Code','C');
-- DDL and sample data population, end
;WITH rs AS
(
SELECT ID, (
SELECT *
FROM @tbl AS c
WHERE c.id = p.id
FOR XML PATH('r'), TYPE, ROOT('root')
) AS xmldata
FROM @tbl AS p
GROUP BY id
)
SELECT ID
, COALESCE(xmldata.value('(/root/r[attribute="Manager"]/Value/text())[1]','VARCHAR(30)'),'') AS Manager
, COALESCE(xmldata.value('(/root/r[attribute="Account"]/Value/text())[1]','VARCHAR(30)'),'') AS Account
, COALESCE(xmldata.value('(/root/r[attribute="Cardno"]/Value/text())[1]','VARCHAR(30)'),'') AS Cardno
, COALESCE(REPLACE(xmldata.query('data(/root/r[attribute="Code"]/Value)').value('.', 'VARCHAR(MAX)'), SPACE(1), ','),'') AS Code
FROM rs
ORDER BY ID;
Output
+-----+---------+---------+--------+------+
| ID | Manager | Account | Cardno | Code |
+-----+---------+---------+--------+------+
| 101 | Rudolf | 456 | | B |
| 102 | Anna | | 123 | B,C |
+-----+---------+---------+--------+------+
UPD: "STRING_AGG only Server 2017+"
You can solve this task using CTE and STRING_AGG function, for example:
declare
@t table (id int, Attribute varchar (100), [Value] varchar (100) )
insert into @t
values
(101, 'Manager', 'Rudolf'),
(101, 'Account', '456'),
(101, 'Code', 'B'),
(102, 'Manager', 'Anna'),
(102, 'Cardno', '123'),
(102, 'Code', 'B'),
(102, 'Code', 'C')
;with cte as
(
select id, Attribute
,STRING_AGG([Value], ', ') WITHIN GROUP (ORDER BY [Value] ASC) AS [Value]
from @t
group by ID, Attribute
)
select
max(p.ID) ID
,a.Value Manager
,isnull(b.Value, '') Account
,isnull(c.Value, '') Cardno
,isnull(e.Value, '') Code
from cte p
left join cte a on a.id =p.ID and a.attribute = 'Manager'
left join cte b on b.id = p.id and b.attribute = 'Account'
left join cte c on c.id = p.id and c.attribute = 'Cardno'
left join cte e on e.id = p.id and e.attribute = 'Code'
group by p.ID, a.Value,b.Value,c.Value,e.Value
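The pivot-plus-concatenation idea can also be written as plain conditional aggregation. Below is a minimal sketch, assuming SQLite via Python's sqlite3 module, where GROUP_CONCAT stands in for SQL Server's STRING_AGG and MAX(CASE ...) stands in for PIVOT; it shows the technique, not the exact T-SQL above.

```python
import sqlite3

# Assumption: SQLite in memory, loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, attribute TEXT, value TEXT);
    INSERT INTO table1 VALUES
        (101, 'Manager', 'Rudolf'),
        (101, 'Account', '456'),
        (101, 'Code',    'B'),
        (102, 'Manager', 'Anna'),
        (102, 'Cardno',  '123'),
        (102, 'Code',    'B'),
        (102, 'Code',    'C');
""")

pivot_sql = """
    SELECT id,
           -- single-valued attributes: pick the one value per id
           MAX(CASE WHEN attribute = 'Manager' THEN value END) AS manager,
           MAX(CASE WHEN attribute = 'Account' THEN value END) AS account,
           MAX(CASE WHEN attribute = 'Cardno'  THEN value END) AS cardno,
           -- multi-valued attribute: concatenate; NULLs from other attributes are skipped
           GROUP_CONCAT(CASE WHEN attribute = 'Code' THEN value END) AS code
    FROM table1
    GROUP BY id
    ORDER BY id
"""

pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

SQLite does not guarantee the order in which GROUP_CONCAT joins values, so the codes for ID 102 may come back as either order of B and C; STRING_AGG ... WITHIN GROUP (ORDER BY ...) is how SQL Server pins that order down.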

Eliminating duplicate rows except one column with condition

I am having trouble finding an appropriate query (SQL Server) for selecting records under certain conditions. The table I will be using has more than 100,000 rows and more than 20 columns.
I need code that satisfies the following conditions:
1.) If the [policy] and [plan] combination is unique across rows, select that record.
2.) If [policy] and [plan] return 2 or more rows, select the records whose [code] column isn't 999.
3.) In some cases the unwanted rows may not have 999 in the [code] column but some other specific value.
In other words, I would like to get row number 1,2,4,5,7.
Here is an example of what the table looks like
row #|policy|plan|code
-----------------------
1 | a | aa |111
-----------------------
2 | b | bb |112
-----------------------
3 | b | bb |999
-----------------------
4 | c | cc |111
-----------------------
5 | c | cc |112
-----------------------
6 | c | cc |999
-----------------------
7 | d | dd |999
-----------------------
I'm expecting to see something like
row #|policy|plan|code
-----------------------
1 | a | aa |111
-----------------------
2 | b | bb |112
-----------------------
4 | c | cc |111
-----------------------
5 | c | cc |112
-----------------------
7 | d | dd |999
-----------------------
Thank you in advance
This sounds like a prioritization query. You can use row_number():
select t.*
from (select t.*,
row_number() over (partition by policy, plan
order by code
) as seqnum
from t
) t
where seqnum = 1;
The expected output makes this a bit clearer:
select t.*
from (select t.*,
rank() over (partition by policy, plan
order by (case when code = 999 then 1 else 2 end) desc
) as seqnum
from t
) t
where seqnum = 1;
The OP wants all codes that are not 999 unless the only codes are 999. So, another approach is:
select t.*
from t
where t.code <> 999
union all
select t.*
from t
where t.code = 999 and
not exists (select 1
from t t2
where t2.policy = t.policy and t2.plan = t.plan and
t2.code <> 999
);
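The rule stated above (keep every non-999 row, and keep a 999 row only when its policy/plan group has nothing else) can be checked against the sample data. This is a minimal sketch assuming SQLite via Python's sqlite3 module; it folds the two UNION ALL branches into one WHERE ... OR NOT EXISTS, which selects the same rows, and the column name row_no is an assumption standing in for the question's "row #".

```python
import sqlite3

# Assumption: SQLite in memory; "plan" is quoted because it can be treated as a keyword.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (row_no INTEGER, policy TEXT, "plan" TEXT, code INTEGER);
    INSERT INTO t VALUES
        (1, 'a', 'aa', 111),
        (2, 'b', 'bb', 112),
        (3, 'b', 'bb', 999),
        (4, 'c', 'cc', 111),
        (5, 'c', 'cc', 112),
        (6, 'c', 'cc', 999),
        (7, 'd', 'dd', 999);
""")

keep_sql = """
    SELECT t.row_no, t.policy, t."plan", t.code
    FROM t
    WHERE t.code <> 999                      -- every "real" code is kept
       OR NOT EXISTS (SELECT 1               -- a 999 row survives only if it is alone
                      FROM t AS t2
                      WHERE t2.policy = t.policy
                        AND t2."plan" = t."plan"
                        AND t2.code <> 999)
    ORDER BY t.row_no
"""

kept = conn.execute(keep_sql).fetchall()
kept_rows = [row[0] for row in kept]
print(kept_rows)  # → [1, 2, 4, 5, 7]
```

If other codes besides 999 should count as unwanted (condition 3 in the question), the two `<> 999` comparisons could become `NOT IN (...)` with the full list of such codes.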
Maybe you want this (eliminate the last row if there is more than one)?
select t.*
from (select t.*
, row_number() over (partition by policy, plan
order by code desc
) AS RN
, COUNT(*) over (partition by policy, plan) AS RC
from t
) t
where RN > 1 OR RN=RC;
Output:
row policy plan code RN RC
1 1 a aa 111 1 1
2 2 b bb 112 2 2
3 5 c cc 112 2 3
4 4 c cc 111 3 3
5 7 d dd 999 1 1
CREATE TABLE #Table2
([row] int, [policy] varchar(1), [plan] varchar(2), [code] int)
;
INSERT INTO #Table2
([row], [policy], [plan], [code])
VALUES
(1, 'a', 'aa', 111),
(2, 'b', 'bb', 112),
(3, 'b', 'bb', 999),
(4, 'c', 'cc', 111),
(5, 'c', 'cc', 112),
(6, 'c', 'cc', 999),
(7, 'd', 'dd', 999)
;
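with cte
as
(
select *,
-- non-999 codes rank first; a 999 row ranks first only when it is the only code
rank() over (partition by policy, [plan]
order by case when code = 999 then 1 else 0 end
) as seqnum
from #Table2
)
select [row], [policy], [plan], [code] from cte where seqnum=1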

Remove duplicate rows from joined table

I have the following SQL query:
SELECT m.School, c.avgscore
FROM postswithratings c
join ZEntrycriteria m on c.fk_postID= m.schoolcode
which produces the following result:
School| avgscore
xyz | 5
xyz | 5
xyz | 5
abc | 3
abc | 3
kkk | 1
My question is how to remove those duplicates and get only the following:
School| avgscore
xyz | 5
abc | 3
kkk | 1
I tried with:
SELECT m.School, c.avgscore
FROM postswithratings c
join ZEntrycriteria m on c.fk_postID= m.schoolcode
group by m.School
But it gives me the following error:
"Column 'postswithratings.avgscore' is invalid in the select list
because it is not contained in either an aggregate function or the
GROUP BY clause."
No need to make things complicated. Just go with:
SELECT m.School, c.avgscore
FROM postswithratings c
join ZEntrycriteria m on c.fk_postID= m.schoolcode
group by m.School, c.avgscore
or
SELECT DISTINCT m.School, c.avgscore
FROM postswithratings c
join ZEntrycriteria m on c.fk_postID= m.schoolcode
You only have to add the DISTINCT keyword, like this:
SELECT DISTINCT m.School, c.avgscore
FROM postswithratings c
join ZEntrycriteria m on c.fk_postID= m.schoolcode
CREATE TABLE #Table2
([School] varchar(3), [avgscore] int)
INSERT INTO #Table2
([School], [avgscore])
VALUES
('xyz', 5),
('xyz', 5),
('xyz', 5),
('abc', 3),
('abc', 3),
('kkk', 1)
;
-- partition by both columns so that two schools with the same score are not merged
SELECT SCHOOL,AVGSCORE FROM (SELECT *,ROW_NUMBER() OVER( PARTITION BY [SCHOOL],[AVGSCORE] ORDER BY (SELECT NULL)) AS RN FROM #TABLE2)A
WHERE RN=1
ORDER BY AVGSCORE
-------
;WITH CTE AS
(SELECT *,ROW_NUMBER() OVER( PARTITION BY [SCHOOL],[AVGSCORE] ORDER BY (SELECT NULL)) AS RN FROM #TABLE2)
SELECT SCHOOL,AVGSCORE FROM CTE WHERE RN=1
output
SCHOOL AVGSCORE
kkk 1
abc 3
xyz 5
Using the DISTINCT keyword makes SQL treat the result as a set instead of a multiset, so each row appears only once.
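A quick way to see the set-versus-multiset difference is to run the plain, DISTINCT and GROUP BY variants side by side. This is a minimal sketch assuming SQLite via Python's sqlite3 module, with a hypothetical table named scores that holds the already-joined result from the question.

```python
import sqlite3

# Assumption: "scores" stands in for the output of the join in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (school TEXT, avgscore INTEGER);
    INSERT INTO scores VALUES
        ('xyz', 5), ('xyz', 5), ('xyz', 5),
        ('abc', 3), ('abc', 3),
        ('kkk', 1);
""")

# Plain SELECT: a multiset, duplicates included.
all_rows = conn.execute("SELECT school, avgscore FROM scores").fetchall()

# DISTINCT: each (school, avgscore) combination once.
distinct_rows = conn.execute(
    "SELECT DISTINCT school, avgscore FROM scores ORDER BY avgscore DESC"
).fetchall()

# GROUP BY on every selected column gives the same rows.
grouped_rows = conn.execute(
    "SELECT school, avgscore FROM scores "
    "GROUP BY school, avgscore ORDER BY avgscore DESC"
).fetchall()

print(len(all_rows), distinct_rows, grouped_rows)
```

Both deduplicating forms compare whole rows, so two different schools that happen to share a score are both kept.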
This will delete the Duplicate rows (Only Duplicate)
Schema:
CREATE TABLE #TAB (School varchar(5) , avgscore int)
INSERT INTO #TAB
SELECT 'xyz', 5
UNION ALL
SELECT 'xyz', 5
UNION ALL
SELECT 'xyz', 5
UNION ALL
SELECT 'abc', 3
UNION ALL
SELECT 'abc', 3
UNION ALL
SELECT 'kkk', 1
Now use a CTE as your temporary view and delete the data.
;WITH CTE AS(
SELECT ROW_NUMBER() OVER (PARTITION BY School,avgscore ORDER BY (SELECT 1)) DUP_C,
School, avgscore FROM #TAB
)
DELETE FROM CTE WHERE DUP_C>1
Now check #TAB; the data will be:
+--------+----------+
| School | avgscore |
+--------+----------+
| xyz | 5 |
| abc | 3 |
| kkk | 1 |
+--------+----------+
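You only need GROUP BY when you are using an aggregate function, e.g. MAX, SUM, AVG. In this case DISTINCT is enough (it is a keyword that applies to the whole row, not a function on one column):
SELECT DISTINCT m.School, c.avgscore
FROM postswithratings c
join ZEntrycriteria m on c.fk_postID= m.schoolcode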

double sorted selection from a single table

I have a table with an id as the primary key, and a description as another field.
I want to first select the records that have the id<=4, sorted by description, then I want all the other records (id>4), sorted by description. Can't get there!
select id, descr
from t
order by
case when id <= 4 then 0 else 1 end,
descr
select *, id<=4 as low from your_table order by low desc, description
You may want to use an id <= 4 expression in your ORDER BY clause:
SELECT * FROM your_table ORDER BY id <= 4 DESC, description;
Test case (using MySQL):
CREATE TABLE your_table (id int, description varchar(50));
INSERT INTO your_table VALUES (1, 'c');
INSERT INTO your_table VALUES (2, 'a');
INSERT INTO your_table VALUES (3, 'z');
INSERT INTO your_table VALUES (4, 'b');
INSERT INTO your_table VALUES (5, 'g');
INSERT INTO your_table VALUES (6, 'o');
INSERT INTO your_table VALUES (7, 'c');
INSERT INTO your_table VALUES (8, 'p');
Result:
+------+-------------+
| id | description |
+------+-------------+
| 2 | a |
| 4 | b |
| 1 | c |
| 3 | z |
| 7 | c |
| 5 | g |
| 6 | o |
| 8 | p |
+------+-------------+
8 rows in set (0.00 sec)
Related post:
Using MySql, can I sort a column but have 0 come last?
select id, description
from MyTable
order by case when id <= 4 then 0 else 1 end, description
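You can use UNION ALL with an explicit group column (an ORDER BY inside each subquery does not guarantee the order of the combined result):
SELECT id, description FROM (
SELECT id, description, 0 AS grp FROM table1 WHERE id <=4
UNION ALL
SELECT id, description, 1 AS grp FROM table1 WHERE id >4
) u ORDER BY grp, description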
OR
SELECT * FROM table1
ORDER BY
CASE WHEN id <=4 THEN 0
ELSE 1
END, description
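The CASE-based ordering can be verified against the MySQL test data shown earlier. This is a minimal sketch assuming SQLite via Python's sqlite3 module in place of MySQL; the CASE expression is standard SQL, so the ordering logic should carry over.

```python
import sqlite3

# Assumption: SQLite in memory, loaded with the same eight rows as the MySQL test case.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER PRIMARY KEY, description TEXT);
    INSERT INTO your_table VALUES
        (1, 'c'), (2, 'a'), (3, 'z'), (4, 'b'),
        (5, 'g'), (6, 'o'), (7, 'c'), (8, 'p');
""")

order_sql = """
    SELECT id, description
    FROM your_table
    ORDER BY CASE WHEN id <= 4 THEN 0 ELSE 1 END,  -- first key: which group the row is in
             description                           -- second key: sort inside each group
"""

ordered = conn.execute(order_sql).fetchall()
ordered_ids = [row[0] for row in ordered]
print(ordered_ids)  # → [2, 4, 1, 3, 7, 5, 6, 8]
```

The first sort key only separates the two groups; all the visible ordering inside each group comes from description.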