SQL Server query for merging 2 rows in 1

There is a SQL Server table (see the screenshot below) that I cannot change:
Products have identifiers and process parameters. There are 2 processes, A and B. Each process stores its data in its own row.
I would like to get a table result without useless NULL values. One product on one row, no more. Desired cells are highlighted. See the 2nd screenshot for the desired output:

select a.id,
isnull(b.state, a.state) as state,
isnull(b.process, a.process) as process,
isnull(b.length, a.length) as length,
isnull(b.force, a.force) as force,
isnull(b.angle, a.angle) as angle
from table as a
left join table as b
on a.id = b.id
and b.process = 'B'
where a.process = 'A'
DECLARE @t AS TABLE (id int, state varchar(10), process varchar(10), length int, angle int,
primary key (id, process));
insert into @t (id, state, process, length, angle) values
(111, 'OK', 'A', 77, null)
,(111, 'OK', 'B', null, 30)
,(159, 'NOK', 'A', 89, null)
,(147, 'OK', 'A', 78, null)
,(147, 'NOK', 'B', null, 36);
select ta.id, --ta.*, tb.*
isnull(tb.state, ta.state) as state,
isnull(tb.process, ta.process) as process,
isnull(tb.length, ta.length) as length,
isnull(tb.angle, ta.angle) as angle
from @t ta
left join @t tb
on ta.id = tb.id
and tb.process = 'B'
where ta.process = 'A'
order by ta.id
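The self-join above can be tried out without a SQL Server instance. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in, with COALESCE in place of ISNULL and a hypothetical table name t; it reuses the sample rows from the answer and is not the poster's exact setup.

```python
import sqlite3

def merge_process_rows(rows):
    """Return one row per id, preferring process B values over process A values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INT, state TEXT, process TEXT, length INT, angle INT,"
        " PRIMARY KEY (id, process))"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)
    # Row A is the anchor; row B (if any) is attached by the filtered LEFT JOIN.
    merged = conn.execute(
        """
        SELECT ta.id,
               COALESCE(tb.state, ta.state)     AS state,
               COALESCE(tb.process, ta.process) AS process,
               COALESCE(tb.length, ta.length)   AS length,
               COALESCE(tb.angle, ta.angle)     AS angle
        FROM t AS ta
        LEFT JOIN t AS tb
               ON tb.id = ta.id AND tb.process = 'B'
        WHERE ta.process = 'A'
        ORDER BY ta.id
        """
    ).fetchall()
    conn.close()
    return merged

sample_rows = [
    (111, "OK", "A", 77, None),
    (111, "OK", "B", None, 30),
    (159, "NOK", "A", 89, None),
    (147, "OK", "A", 78, None),
    (147, "NOK", "B", None, 36),
]

for row in merge_process_rows(sample_rows):
    print(row)
```

Because the filter on process B sits in the ON clause rather than in WHERE, a product that only has an A row (159 here) is still returned, with its missing values left as NULL.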

Self join? (edited)
so,
select
a.id,
a.process,
isnull(a.length, b.length) as length,
isnull(a.force, b.force) as force,
isnull(a.angle, b.angle) as angle
from
(select * from tmpdel where process = 'A') as a left join
(select * from tmpdel where process = 'B') as b on
a.id = b.id
I've given this a quick test and I think it looks right.

Related

Transpose data in SQL Server Select

I am wondering if there is a better way to write this query. It achieves the target result but my colleague would prefer it be written without the subselects into temp tables t1-t3. The main "challenge" here is transposing the data from dbo.ReviewsData into a single row along with the rest of the data joined from dbo.Products and dbo.Reviews.
CREATE TABLE dbo.Products (
idProduct int,
product_title varchar(100),
PRIMARY KEY (idProduct)
);
INSERT INTO dbo.Products VALUES
(1001, 'poptart'),
(1002, 'coat hanger'),
(1003, 'sunglasses');
CREATE TABLE dbo.Reviews (
Rev_IDReview int,
Rev_IDProduct int,
PRIMARY KEY (Rev_IDReview),
FOREIGN KEY (Rev_IDProduct) REFERENCES dbo.Products(idProduct)
);
INSERT INTO dbo.Reviews VALUES
(456, 1001),
(457, 1002),
(458, 1003);
CREATE TABLE dbo.ReviewFields (
RF_IDField int,
RF_FieldName varchar(32),
PRIMARY KEY (RF_IDField)
);
INSERT INTO dbo.ReviewFields VALUES
(1, 'Customer Name'),
(2, 'Review Title'),
(3, 'Review Message');
CREATE TABLE dbo.ReviewsData (
RD_idData int,
RD_IDReview int,
RD_IDField int,
RD_FieldContent varchar(100),
PRIMARY KEY (RD_idData),
FOREIGN KEY (RD_IDReview) REFERENCES dbo.Reviews(Rev_IDReview)
);
INSERT INTO dbo.ReviewsData VALUES
(79, 456, 1, 'Daniel'),
(80, 456, 2, 'Love this item!'),
(81, 456, 3, 'Works well...blah blah'),
(82, 457, 1, 'Joe!'),
(84, 457, 2, 'Pure Trash'),
(85, 457, 3, 'It was literally a used banana peel'),
(86, 458, 1, 'Karen'),
(87, 458, 2, 'Could be better'),
(88, 458, 3, 'I can always find something wrong');
SELECT P.product_title as "item", t1.ReviewedBy, t2.ReviewTitle, t3.ReviewContent
FROM dbo.Reviews R
INNER JOIN dbo.Products P
ON P.idProduct = R.Rev_IDProduct
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewedBy", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewTitle", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 2
) t2
ON t2.RD_IDReview = R.Rev_IDReview
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewContent", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 3
) t3
ON t3.RD_IDReview = R.Rev_IDReview
EDIT: I have updated this post with the create statements for the tables as opposed to an image of the data (shame on me) and a more specific description of what exactly needed to be improved. Thanks to all for the comments and patience.
As others have said in comments, there is nothing objectively wrong with the query. However, you could argue that it's verbose and hard to read.
One way to shorten it is to replace INNER JOIN with CROSS APPLY:
INNER JOIN (
SELECT D.RD_FieldContent AS 'ReviewedBy', D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
APPLY lets you refer to values from the outer query, like in a subquery:
CROSS APPLY (
SELECT D.RD_FieldContent AS 'ReviewedBy'
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1 AND D.RD_IDReview = R.Rev_IDReview
) t1
I think of APPLY like a subquery that brings in new columns. It's like a cross between a subquery and a join. Benefits:
The query can be shorter, because you don't have to repeat the ID column twice.
You don't have to expose columns that you don't need.
Disadvantages:
If the query in the APPLY references outer values, then you can't extract it and run it all by itself without modifications.
APPLY is not part of standard SQL (the standard counterpart is LATERAL), and it's not that widely used.
Another thing to consider is using subqueries instead of joins for values that you only need in one place. Benefits:
The queries can be made shorter, because you don't have to repeat the ID column twice, and you don't have to give the output columns unique aliases.
You only have to look in one place to see the whole subquery.
A scalar subquery can only return 1 row (SQL Server raises an error if it returns more), so you can't accidentally create extra rows if only 1 row is desired.
SELECT
P.product_title as 'item',
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewedBy,
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 2 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewTitle,
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 3 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewContent
FROM dbo.Reviews R
INNER JOIN dbo.Products P ON P.idProduct = R.Rev_IDProduct
Edit:
It just occurred to me that you have made the joins themselves unnecessarily verbose (@Dale K actually already pointed this out in the comments):
INNER JOIN (
SELECT D.RD_FieldContent AS 'ReviewedBy', D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
Shorter:
SELECT RevBy.RD_FieldContent AS 'ReviewedBy'
...
INNER JOIN dbo.ReviewsData RevBy
ON RevBy.RD_IDReview = R.Rev_IDReview AND
RevBy.RD_IDField = 1
The originally submitted query is undoubtedly and unnecessarily verbose. Having digested various feedback from the community it has been revised to the following, working splendidly. In retrospect I feel very silly for having done this with subselects originally. I am clearly intermediate at best when it comes to SQL - I had not realized an "AND" clause could be included in the "ON" clause in a "JOIN" statement. Not sure why I would have made such a poor assumption.
SELECT
P.product_title as 'item',
D1.RD_FieldContent as 'ReviewedBy',
D2.RD_FieldContent as 'ReviewTitle',
D3.RD_FieldContent as 'ReviewContent'
FROM dbo.Reviews R
INNER JOIN dbo.Products P
ON P.idProduct = R.Rev_IDProduct
INNER JOIN dbo.ReviewsData D1
ON D1.RD_IDReview = R.Rev_IDReview AND D1.RD_IDField = 1
INNER JOIN dbo.ReviewsData D2
ON D2.RD_IDReview = R.Rev_IDReview AND D2.RD_IDField = 2
INNER JOIN dbo.ReviewsData D3
ON D3.RD_IDReview = R.Rev_IDReview AND D3.RD_IDField = 3
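To see the effect of putting the field filter in the ON clause, here is a minimal sketch that runs the revised query against the sample data. It assumes SQLite via Python's sqlite3 module instead of SQL Server, drops the dbo schema prefix, and adds an ORDER BY so the output order is deterministic; none of that comes from the original post.

```python
import sqlite3

def transposed_reviews():
    """Return one row per review: (item, ReviewedBy, ReviewTitle, ReviewContent)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Products (idProduct INT PRIMARY KEY, product_title TEXT);
        CREATE TABLE Reviews (Rev_IDReview INT PRIMARY KEY, Rev_IDProduct INT);
        CREATE TABLE ReviewsData (RD_idData INT PRIMARY KEY, RD_IDReview INT,
                                  RD_IDField INT, RD_FieldContent TEXT);
        INSERT INTO Products VALUES
            (1001, 'poptart'), (1002, 'coat hanger'), (1003, 'sunglasses');
        INSERT INTO Reviews VALUES (456, 1001), (457, 1002), (458, 1003);
        INSERT INTO ReviewsData VALUES
            (79, 456, 1, 'Daniel'),
            (80, 456, 2, 'Love this item!'),
            (81, 456, 3, 'Works well...blah blah'),
            (82, 457, 1, 'Joe!'),
            (84, 457, 2, 'Pure Trash'),
            (85, 457, 3, 'It was literally a used banana peel'),
            (86, 458, 1, 'Karen'),
            (87, 458, 2, 'Could be better'),
            (88, 458, 3, 'I can always find something wrong');
        """
    )
    # Each join picks exactly one field of the review via the extra ON condition.
    rows = conn.execute(
        """
        SELECT P.product_title, D1.RD_FieldContent, D2.RD_FieldContent, D3.RD_FieldContent
        FROM Reviews R
        INNER JOIN Products P     ON P.idProduct = R.Rev_IDProduct
        INNER JOIN ReviewsData D1 ON D1.RD_IDReview = R.Rev_IDReview AND D1.RD_IDField = 1
        INNER JOIN ReviewsData D2 ON D2.RD_IDReview = R.Rev_IDReview AND D2.RD_IDField = 2
        INNER JOIN ReviewsData D3 ON D3.RD_IDReview = R.Rev_IDReview AND D3.RD_IDField = 3
        ORDER BY R.Rev_IDReview
        """
    ).fetchall()
    conn.close()
    return rows

for row in transposed_reviews():
    print(row)
```

Each review contributes one row because every (review, field) pair occurs once in the sample data; a review with a missing field would drop out of the result under INNER JOIN.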

Set Based Table Filtering using Table-Valued Parameter

I'm looking to see if there is a Set Based way of filtering Table Data given a Table-Valued Parameter as an input to a UDF or SPROC.
The data table is defined as:
CREATE TABLE activity
(
id int identity primary key,
employeeId int NOT NULL,
stationId char(1) NOT NULL,
type int NOT NULL
);
The Table-Valued Parameter is defined as:
CREATE TYPE activityType AS TABLE(
stationId char(1) NOT NULL,
type int NOT NULL
);
Given the following Table data:
INSERT INTO activity
(employeeId, stationId, type)
VALUES
(100, 'A', 1), (100, 'B', 2), (100, 'C', 3),
(200, 'A', 1), (200, 'B', 2), (200, 'D', 1),
(300, 'A', 2), (300, 'C', 3), (300, 'D', 2);
I would like to be able to filter given a particular TVP from the UI.
Example 1: Find all employeeId who performed Activity 1 @ Station A AND Activity 2 @ Station B
DECLARE @activities activityType;
INSERT INTO @activities
VALUES('A', 1),('B', 2)
Expected Result from applying this TVP:
employeeId
-----------------
100
200
Example 2: Find all employeeId who performed Activity 1 @ Station A, Activity 2 @ Station B, AND Activity 3 @ Station C
DECLARE @activities activityType;
INSERT INTO @activities
VALUES('A', 1),('B', 2),('C', 3);
Expected Result from applying this TVP:
employeeId
-----------------
100
I can apply this filter by looping over the TVP and intersecting the individually filtered results. However, I have the gut feeling there is a Set Based approach using CTEs or MERGE that I just can't wrap my head around at the moment.
Maybe not the perfect solution, but you could try something like the following:
DECLARE @ExpectedActivities int
SELECT
@ExpectedActivities = COUNT(*)
FROM
@activities
SELECT
*
FROM
activity A
INNER JOIN
(
SELECT
NA.employeeId
FROM
activity NA
INNER JOIN @activities FA ON FA.stationId = NA.stationId
AND FA.type = NA.type
GROUP BY
NA.employeeId
HAVING
COUNT(*) >= @ExpectedActivities
) B ON A.employeeId = B.employeeId
This is a Relational Division with no Remainder problem. Dwain Camps has an article about this with a number of solutions:
High Performance Relational Division in SQL Server
SELECT a.employeeId
FROM (
SELECT DISTINCT employeeId, stationId, type
FROM activity
) a
INNER JOIN @activities at
ON a.stationId = at.stationId
AND a.type = at.type
GROUP BY
a.employeeId
HAVING
COUNT(*) = (SELECT COUNT(*) FROM @activities);
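The relational-division query can be checked against both examples from the question. This is a minimal sketch, assuming SQLite via Python's sqlite3 module, with an ordinary table named wanted (a hypothetical name) standing in for the table-valued parameter; it also assumes the filter list contains no duplicate rows.

```python
import sqlite3

ACTIVITY_ROWS = [
    (100, "A", 1), (100, "B", 2), (100, "C", 3),
    (200, "A", 1), (200, "B", 2), (200, "D", 1),
    (300, "A", 2), (300, "C", 3), (300, "D", 2),
]

def employees_matching_all(wanted):
    """Return employeeIds that performed every (stationId, type) pair in `wanted`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activity (id INTEGER PRIMARY KEY, employeeId INT NOT NULL,"
        " stationId TEXT NOT NULL, type INT NOT NULL)"
    )
    # Stand-in for the table-valued parameter.
    conn.execute("CREATE TABLE wanted (stationId TEXT NOT NULL, type INT NOT NULL)")
    conn.executemany(
        "INSERT INTO activity (employeeId, stationId, type) VALUES (?, ?, ?)",
        ACTIVITY_ROWS,
    )
    conn.executemany("INSERT INTO wanted VALUES (?, ?)", wanted)
    rows = conn.execute(
        """
        SELECT a.employeeId
        FROM (SELECT DISTINCT employeeId, stationId, type FROM activity) AS a
        INNER JOIN wanted AS w
                ON w.stationId = a.stationId AND w.type = a.type
        GROUP BY a.employeeId
        -- An employee qualifies only if it matched every wanted pair.
        HAVING COUNT(*) = (SELECT COUNT(*) FROM wanted)
        ORDER BY a.employeeId
        """
    ).fetchall()
    conn.close()
    return [employee_id for (employee_id,) in rows]

print(employees_matching_all([("A", 1), ("B", 2)]))            # Example 1
print(employees_matching_all([("A", 1), ("B", 2), ("C", 3)]))  # Example 2
```

The DISTINCT in the derived table matters when an employee can repeat the same activity: without it, a repeated match could inflate COUNT(*) and make the equality test fail or pass for the wrong reason.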
For scenario 1, you could inner join with the table-valued parameter itself:
select distinct a.employeeId
from #activity as a
inner join @activities as b on b.stationId = a.stationId and b.type = a.type
For the second scenario:
select distinct act.employeeId
from #activity as act
where act.employeeId not in (
select distinct a.employeeId from #activity as a
left join @activities as b on b.stationId = a.stationId and b.type = a.type
where b.stationId is null)

insert with subquery and select

I would like to insert rows into a table using a select statement to query specific data but use data from a different table as part of the insert. Example:
Table A: Clients where data is being queried and copied
Table B: MailOptOut where data is being inserted
I want to insert in MailOptOut table two values, a hardcoded field 'Summer Promotion' and then the acct# from the clients table (client.acct_no)
Here is my code that isn't working:
INSERT INTO PL00.DBO.mailcoptout (MC_NAME, ACCT_NO)
VALUES
('Summer Service Promo', client.acct_no),
('Referral Rewards Doubled', client.acct_no),
('Holiday Decorating 1', client.acct_no),
('Holiday Decorating 2', client.acct_no)
select client.acct_no, mailcoptout.* from plshared.dbo.client left join PL00.DBO.mailcoptout on mailcoptout.ACCT_NO = client.ACCT_NO
where client.U_SOLICIT = 'y'
and client.acct_no = '131335'
and client.INACTIVE <> 'y'
and mailcoptout.MC_NAME is null
Is this what you are trying to do?
INSERT INTO PL00.DBO.mailcoptout (MC_NAME, ACCT_NO)
select x.mc_name, c.acct_no
from plshared.dbo.client c left join
PL00.DBO.mailcoptout mc
on mc.ACCT_NO = c.ACCT_NO cross join
(select 'Summer Service Promo' as MC_NAME union all
select 'Referral Rewards Doubled' as MC_NAME union all
select 'Holiday Decorating 1' as MC_NAME union all
select 'Holiday Decorating 2' as MC_NAME
) x
where c.U_SOLICIT = 'y' and
c.acct_no = '131335' and
c.INACTIVE <> 'y' and
mc.MC_NAME is null;
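The INSERT ... SELECT pattern with a cross-joined list of constants can be sketched as follows. This assumes SQLite via Python's sqlite3 module rather than SQL Server, unqualified table names, and made-up client rows (only account 131335 appears in the question); treat it as an illustration of the pattern, not of the real schema.

```python
import sqlite3

def build_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE client (acct_no TEXT PRIMARY KEY, u_solicit TEXT, inactive TEXT);
        CREATE TABLE mailcoptout (mc_name TEXT, acct_no TEXT);
        -- Hypothetical clients: the second one is inactive and must be skipped.
        INSERT INTO client VALUES ('131335', 'y', 'n'), ('222222', 'y', 'y');
        """
    )
    return conn

def insert_opt_outs(conn, acct_no):
    """Insert one opt-out row per promotion for a client that has none yet."""
    conn.execute(
        """
        INSERT INTO mailcoptout (mc_name, acct_no)
        SELECT x.mc_name, c.acct_no
        FROM client AS c
        LEFT JOIN mailcoptout AS mc ON mc.acct_no = c.acct_no
        CROSS JOIN (
            SELECT 'Summer Service Promo' AS mc_name UNION ALL
            SELECT 'Referral Rewards Doubled' UNION ALL
            SELECT 'Holiday Decorating 1' UNION ALL
            SELECT 'Holiday Decorating 2'
        ) AS x
        WHERE c.u_solicit = 'y'
          AND c.acct_no = ?
          AND c.inactive <> 'y'
          AND mc.mc_name IS NULL   -- only clients without any opt-out row
        """,
        (acct_no,),
    )
    count = conn.execute(
        "SELECT COUNT(*) FROM mailcoptout WHERE acct_no = ?", (acct_no,)
    ).fetchone()[0]
    return count

conn = build_db()
print(insert_opt_outs(conn, "131335"))  # rows now stored for this client
print(insert_opt_outs(conn, "222222"))  # inactive client, nothing inserted
```

The VALUES form in the question fails because a VALUES list cannot reference columns of another table; selecting from the client table and cross joining the promotion names supplies both pieces in one statement.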

How to UPDATE pivoted table in SQL SERVER

I have a flat table which I have to join to my main table using the EAN attribute, and then update gid (the id of my main table).
id attrib value gid
1 weight 10 NULL
1 ean 123123123112 NULL
1 color blue NULL
2 weight 5 NULL
2 ean 331231313123 NULL
I tried pivoting the ean rows into a column and then joining both tables on ean; up to this point everything works.
--update SideTable
--set gid = ab_id
select gid, ab_id
from SideTable
pivot (max (value) for attrib in ([EAN],[MPN])) as b
join MainTable as c
on c.ab_ean = b.EAN
where b.EAN !='' AND c.ab_archive = '0'
When I select both id columns, the result is fine, but when I uncomment the first two lines and delete the select, the whole table is set to the first gid from my main table.
It should set the main table's id on all attribute rows whose ID has an ean matching one in my main table.
I am sorry for my terrible English, but I hope someone can help me with that.
The reason your update does not work is that there is no link between the source and the target of the update. Although you reference SideTable in the FROM clause, the PIVOT function effectively destroys it, leaving no link back to the instance of SideTable that you are updating. Since there is no link, all rows are updated with the same value, which will be the last value encountered in the FROM.
This can be demonstrated by running the following:
DECLARE @S TABLE (ID INT, Attrib VARCHAR(50), Value VARCHAR(50), gid INT);
INSERT @S
VALUES
(1, 'weight', '10', NULL), (1, 'ean', '123123123112', NULL), (1, 'color', 'blue', NULL),
(2, 'weight', '5', NULL), (2, 'ean', '331231313123', NULL);
SELECT s.*
FROM @S AS s
PIVOT (MAX(Value) FOR attrib IN ([EAN],[MPN])) AS pvt;
You clearly have a table aliased s in the FROM clause; however, because you have used PIVOT, you cannot use SELECT s.*, and you get the following error:
The column prefix 's' does not match with a table name or alias name used in the query.
You haven't provided sample data for your main table, but I am about 95% certain your PIVOT is not needed, I think you can get your update using just normal JOINs:
UPDATE s
SET gid = ab_id
FROM SideTable AS s
INNER JOIN SideTable AS ean
ON ean.ID = s.ID
AND ean.attrib = 'ean'
INNER JOIN MainTable AS m
ON m.ab_EAN = ean.Value
WHERE m.ab_archive = '0'
AND m.ab_EAN != '';
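The idea of linking every attribute row to its own ean row can be tried on the sample data. This is a minimal sketch, assuming SQLite via Python's sqlite3 module; because UPDATE ... FROM is not available in older SQLite versions, it swaps the join for a correlated subquery, and the MainTable rows (ab_id 501 and 502) are invented for illustration since the question gives none.

```python
import sqlite3

def update_gid():
    """Set gid on every SideTable row from the MainTable row matching its ean."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE SideTable (id INT, attrib TEXT, value TEXT, gid INT);
        CREATE TABLE MainTable (ab_id INT, ab_ean TEXT, ab_archive TEXT);
        INSERT INTO SideTable VALUES
            (1, 'weight', '10', NULL), (1, 'ean', '123123123112', NULL),
            (1, 'color', 'blue', NULL),
            (2, 'weight', '5', NULL), (2, 'ean', '331231313123', NULL),
            (3, 'ean', '999', NULL);            -- hypothetical row with no match
        -- Hypothetical main-table rows.
        INSERT INTO MainTable VALUES
            (501, '123123123112', '0'), (502, '331231313123', '0');
        """
    )
    # For each row, find the ean row with the same id, then the main row with that ean.
    # Unlike the INNER JOIN version, rows without a match are set to NULL here.
    conn.execute(
        """
        UPDATE SideTable
        SET gid = (
            SELECT m.ab_id
            FROM SideTable AS ean
            JOIN MainTable AS m ON m.ab_ean = ean.value
            WHERE ean.id = SideTable.id
              AND ean.attrib = 'ean'
              AND m.ab_archive = '0'
              AND m.ab_ean <> ''
        )
        """
    )
    rows = conn.execute(
        "SELECT id, attrib, gid FROM SideTable ORDER BY id, attrib"
    ).fetchall()
    conn.close()
    return rows

for row in update_gid():
    print(row)
```

The key point carried over from the answer is the correlation on id: each target row is tied to exactly one ean value, so different products receive different gid values instead of one value for the whole table.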
As per comment to the question, you need to use update + select statement.
A standard version looks like:
UPDATE
T
SET
T.col1 = OT.col1,
T.col2 = OT.col2
FROM
Some_Table T
INNER JOIN
Other_Table OT
ON
T.id = OT.id
WHERE
T.col3 = 'cool'
As to your needs:
update a
set a.gid = p.ab_id
from SideTable As a
Inner join (
select b.id, c.ab_id
from SideTable
pivot (max (value) for attrib in ([EAN],[MPN])) as b
join MainTable as c
on c.ab_ean = b.EAN
where b.EAN !='' AND c.ab_archive = '0') p ON a.id = p.id
try and break it down a bit more like this..
update SideTable
set SideTable.gid = p.ab_id
FROM
(
select b.id, c.ab_id
from SideTable
pivot (max (value) for attrib in ([EAN],[MPN])) as b
join MainTable as c
on c.ab_ean = b.EAN
where b.EAN !='' AND c.ab_archive = '0'
) p
WHERE p.id = SideTable.id

Recursively sum the nodes of a tree using Postgresql WITH clause

(Using Postgresql 9.1)
I have a tree structure in the database and I need to sum the node's values. There are two caveats:
Not all nodes have a value.
If a parent node has a value, ignore the child values.
While recursing the tree is easy with the powerful recursive WITH clause, it's enforcing these two caveats that is breaking my code. Here's my setup:
CREATE TABLE node (
id VARCHAR(1) PRIMARY KEY
);
INSERT INTO node VALUES ('A');
INSERT INTO node VALUES ('B');
INSERT INTO node VALUES ('C');
INSERT INTO node VALUES ('D');
INSERT INTO node VALUES ('E');
INSERT INTO node VALUES ('F');
INSERT INTO node VALUES ('G');
CREATE TABLE node_value (
id VARCHAR(1) PRIMARY KEY,
value INTEGER
);
INSERT INTO node_value VALUES ('B', 5);
INSERT INTO node_value VALUES ('D', 2);
INSERT INTO node_value VALUES ('E', 0);
INSERT INTO node_value VALUES ('F', 3);
INSERT INTO node_value VALUES ('G', 4);
CREATE TABLE tree (
parent VARCHAR(1),
child VARCHAR(1)
);
INSERT INTO tree VALUES ('A', 'B');
INSERT INTO tree VALUES ('B', 'D');
INSERT INTO tree VALUES ('B', 'E');
INSERT INTO tree VALUES ('A', 'C');
INSERT INTO tree VALUES ('C', 'F');
INSERT INTO tree VALUES ('C', 'G');
This gives me the following tree (nodes and values):
A
|--B(5)
| |--D(2)
| |--E(0)
|
|--C
|--F(3)
|--G(4)
Given the rules above, here are the expected sum values:
A = (5 + 3 + 4) = 12
B = 5
D = 2
E = 0
C = (3 + 4) = 7
F = 3
G = 4
I have written the following SQL, but I can't integrate the recursive UNION and JOIN logic to enforce rule #1 and #2:
WITH recursive treeSum(root, parent, child, total_value) AS (
SELECT tree.parent root, tree.parent, tree.child, node_value.value total_value
FROM tree
LEFT JOIN node_value ON node_value.id = tree.parent
UNION
SELECT treeSum.root, tree.parent, tree.child, node_value.value total_value
FROM tree
INNER JOIN treeSum ON treeSum.child = tree.parent
LEFT JOIN node_value ON node_value.id = tree.parent
)
SELECT root, sum(total_value) FROM treeSum WHERE root = 'A' GROUP BY root
The query returns 10 for root A, but it should be 12. I know the UNION and/or JOIN logic is what's throwing this off. Any help would be appreciated.
EDIT: To clarify, the sum for A is 12, not 14. Given the rules, if a node has a value, grab that value and ignore its children. Because B has a value of 5 we ignore D and E. C has no value, so we grab its children, thus the sum of A = 5(B) + 3(F) + 4(G) = 12. I know it's odd but that's the requirement. Thanks.
EDIT 2: These results will be joined with external datasets so I can't hardcode the root in the WITH clause. For example, I might need something like this:
SELECT root, SUM(total_value) FROM treeSUM WHERE root = 'A' GROUP BY root
This tree is one of many so that means there's multiple roots, specified by calling code--not within the recursive clause itself. Thanks.
EDIT 3: An example of how this will be used in production is the roots will be specified by another table, so I can't hardcode the root into the recursive clause. There might be many roots from many trees.
SELECT rts.id, SUM(COALESCE(value,0)) FROM treeSUM
INNER JOIN roots_to_select rts ON rts.id = treeSUM.id GROUP BY rts.id
SOLUTION (cleaned up from koriander's answer below)! The following allows roots to be specified by outside sources (either using roots_to_select or WHERE criteria):
WITH recursive roots_to_select AS (
SELECT 'A'::varchar as id
),
treeSum(root, id, value) AS (
select node.id as root, node.id, node_value.value
from node
inner join roots_to_select rts on (node.id = rts.id)
left join node_value on (node.id = node_value.id)
union
select treeSum.root, node.id, node_value.value
from treeSum
inner join tree on (treeSum.id = tree.parent)
inner join node on (tree.child = node.id)
left join node_value on (node.id = node_value.id)
where treeSum.value is null
)
select root, sum(coalesce(value, 0))
from treeSum
group by root
OUTPUT: 12
tested here:
with recursive treeSum(id, value) AS (
select node.id, node_value.value
from node
left join node_value on (node.id = node_value.id)
where node.id = 'A'
union
select node.id, node_value.value
from treeSum
inner join tree on (treeSum.id = tree.parent)
inner join node on (tree.child = node.id)
left join node_value on (node.id = node_value.id)
where treeSum.value is null
)
select sum(coalesce(value, 0)) from treeSum
Edit 1: to combine the result with another table, you can do:
select id, (select sum(coalesce(value, 0)) from treeSum) as nodesum
from node
inner join some_table on (...)
where node.id = 'A'
Edit 2: to support multiple roots based on your Edit 3, you can do (untested):
with recursive treeSum(root, id, value) AS (
select node.id as root, node.id, node_value.value
from node
inner join roots_to_select rts on (node.id = rts.id)
left join node_value on (node.id = node_value.id)
union
select treeSum.root, node.id, node_value.value
from treeSum
inner join tree on (treeSum.id = tree.parent)
inner join node on (tree.child = node.id)
left join node_value on (node.id = node_value.id)
where treeSum.value is null
)
select root, sum(coalesce(value, 0))
from treeSum
group by root
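The multi-root version can be exercised on the tree from the question. This is a minimal sketch, assuming SQLite via Python's sqlite3 module instead of PostgreSQL 9.1 (both accept WITH RECURSIVE with UNION), and with roots_to_select created as a small helper table filled by the caller.

```python
import sqlite3

def tree_sums(roots):
    """Return {root: sum}, where a node with a value hides its descendants."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE node (id TEXT PRIMARY KEY);
        CREATE TABLE node_value (id TEXT PRIMARY KEY, value INTEGER);
        CREATE TABLE tree (parent TEXT, child TEXT);
        CREATE TABLE roots_to_select (id TEXT PRIMARY KEY);
        INSERT INTO node VALUES ('A'), ('B'), ('C'), ('D'), ('E'), ('F'), ('G');
        INSERT INTO node_value VALUES ('B', 5), ('D', 2), ('E', 0), ('F', 3), ('G', 4);
        INSERT INTO tree VALUES
            ('A', 'B'), ('B', 'D'), ('B', 'E'), ('A', 'C'), ('C', 'F'), ('C', 'G');
        """
    )
    conn.executemany("INSERT INTO roots_to_select VALUES (?)", [(r,) for r in roots])
    rows = conn.execute(
        """
        WITH RECURSIVE treeSum(root, id, value) AS (
            -- Anchor: every requested root, with its own value if it has one.
            SELECT node.id, node.id, node_value.value
            FROM node
            INNER JOIN roots_to_select rts ON node.id = rts.id
            LEFT JOIN node_value ON node.id = node_value.id
            UNION
            -- Step: descend only below nodes that have no value of their own.
            SELECT treeSum.root, node.id, node_value.value
            FROM treeSum
            INNER JOIN tree ON treeSum.id = tree.parent
            INNER JOIN node ON tree.child = node.id
            LEFT JOIN node_value ON node.id = node_value.id
            WHERE treeSum.value IS NULL
        )
        SELECT root, SUM(COALESCE(value, 0))
        FROM treeSum
        GROUP BY root
        ORDER BY root
        """
    ).fetchall()
    conn.close()
    return dict(rows)

print(tree_sums(["A", "B", "C"]))
```

The `WHERE treeSum.value IS NULL` condition is what enforces both rules at once: recursion stops at the first node carrying a value, so B contributes 5 and its children D and E are never visited, while the valueless C passes through to F and G.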