I have this pivoted table:
+---------+----------+----------+-----+----------+
| Date | Product1 | Product2 | ... | ProductN |
+---------+----------+----------+-----+----------+
| 7/1/15 | 5 | 2 | ... | 7 |
| 8/1/15 | 7 | 1 | ... | 9 |
| 9/1/15 | NULL | 7 | ... | NULL |
| 10/1/15 | 8 | NULL | ... | NULL |
| 11/1/15 | NULL | NULL | ... | NULL |
+---------+----------+----------+-----+----------+
I want to fill in each NULL cell with the nearest non-NULL value above it, so the output should look like this:
+---------+----------+----------+-----+----------+
| Date | Product1 | Product2 | ... | ProductN |
+---------+----------+----------+-----+----------+
| 7/1/15 | 5 | 2 | ... | 7 |
| 8/1/15 | 7 | 1 | ... | 9 |
| 9/1/15 | 7 | 7 | ... | 9 |
| 10/1/15 | 8 | 7 | ... | 9 |
| 11/1/15 | 8 | 7 | ... | 9 |
+---------+----------+----------+-----+----------+
I've found this article that might help, but it only handles one column. How do I apply it to all of my columns, or how else can I achieve this result, given that my columns are dynamic?
Any help would be much appreciated. Thanks!
The ANSI standard has the IGNORE NULLS option on LAG(). This is exactly what you want. Alas, SQL Server has not (yet?) implemented this feature.
So, you can do this in several ways. One is using multiple outer applys. Another uses correlated subqueries:
select p.date,
       -- keep the row's own value; otherwise take the latest earlier non-NULL value
       (case when p.product1 is not null then p.product1
             else (select top 1 p2.product1 from pivoted p2
                   where p2.date < p.date and p2.product1 is not null
                   order by p2.date desc)
        end) as product1,
       (case when p.product2 is not null then p.product2
             else (select top 1 p2.product2 from pivoted p2
                   where p2.date < p.date and p2.product2 is not null
                   order by p2.date desc)
        end) as product2,
       . . .
from pivoted p ;
I would recommend an index on date for this query.
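To try the correlated-subquery idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under a few assumptions: SQLite stands in for SQL Server (so `LIMIT 1` replaces `TOP 1` and `COALESCE` replaces the `CASE`), the dates are stored as ISO strings so that they sort correctly, and only two product columns are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key on date doubles as the index the lookup subqueries rely on.
conn.execute(
    "CREATE TABLE pivoted (date TEXT PRIMARY KEY, product1 INTEGER, product2 INTEGER)"
)
conn.executemany(
    "INSERT INTO pivoted VALUES (?, ?, ?)",
    [
        ("2015-07-01", 5, 2),
        ("2015-08-01", 7, 1),
        ("2015-09-01", None, 7),
        ("2015-10-01", 8, None),
        ("2015-11-01", None, None),
    ],
)

# For each column: keep the row's own value, else look up the most recent
# earlier row whose value in that column is not NULL.
FILL_SQL = """
SELECT p.date,
       COALESCE(p.product1,
                (SELECT p2.product1 FROM pivoted p2
                  WHERE p2.date < p.date AND p2.product1 IS NOT NULL
                  ORDER BY p2.date DESC LIMIT 1)) AS product1,
       COALESCE(p.product2,
                (SELECT p2.product2 FROM pivoted p2
                  WHERE p2.date < p.date AND p2.product2 IS NOT NULL
                  ORDER BY p2.date DESC LIMIT 1)) AS product2
FROM pivoted p
ORDER BY p.date
"""

filled = conn.execute(FILL_SQL).fetchall()
for row in filled:
    print(row)
```

The `IS NOT NULL` filter inside each subquery is what lets the lookup skip over several consecutive NULL rows.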
Here is a solution that works when the table consists of just two columns:
+---------+----------+
| Date | Product |
+---------+----------+
| 7/1/15 | 5 |
| 8/1/15 | 7 |
| 9/1/15 | NULL |
| 10/1/15 | 8 |
| 11/1/15 | NULL |
+---------+----------+
select x.[Date],
case
when x.[Product] is null
then min(c.[Product])
else
x.[Product]
end as Product
from
(
-- this subquery evaluates a minimum distance to the rows where Product column contains a value
select [Date],
[Product],
min(case when delta >= 0 then delta else null end) delta_min,
max(case when delta < 0 then delta else null end) delta_max
from
(
-- this subquery maps Product table to itself and evaluates the difference between the dates
select p.[Date],
p.[Product],
DATEDIFF(dd, p.[Date], pnn.[Date]) delta
from #products p
cross join (select * from #products where [Product] is not null) pnn
) x
group by [Date], [Product]
) x
left join #products c on x.[Date] =
case
when abs(delta_min) < abs(delta_max) then DATEADD(dd, -delta_min, c.[Date])
else DATEADD(dd, -delta_max, c.[Date])
end
group by x.[Date], x.[Product]
order by x.[Date]
In this query, a CROSS JOIN maps every row of the table to the rows that contain a value. I then calculate the differences between the dates to pick the closest rows, and fill the empty cells with their values.
Result:
+---------+----------+
| Date | Product |
+---------+----------+
| 7/1/15 | 5 |
| 8/1/15 | 7 |
| 9/1/15 | 7 |
| 10/1/15 | 8 |
| 11/1/15 | 8 |
+---------+----------+
Note that this query does not necessarily choose the previous value; it selects the value that is closest by date. That also makes it usable for other purposes.
First, you need to add an identity column to a temporary or permanent table; the problem can then be solved as follows.
--- Solution ----
Create Table #Test (ID Int Identity (1,1),[Date] Date , Product_1 INT )
Insert Into #Test ([Date], Product_1)
Values
('7/1/15',5)
,('8/1/15',7)
,('9/1/15',Null)
,('10/1/15',8)
,('11/1/15',Null)
Select ID , DATE ,
IIF ( Product_1 is null ,
(Select Product_1 from #TEST
Where ID = (Select Top 1 a.ID From #TEST a where a.Product_1 is not null and a.ID<b.ID
Order By a.ID desc)
),Product_1) Product_1
from #Test b
-- Solution End ---
I wrote a pretty big SQL query that joins (outer join) two similar queries. Each of them returns a table in this format:
date | value1(q1)
-----------+-----------
05-06-2010 | 10
05-07-2017 | 12
And the same for the second subquery. After I join them, I get the following table:
date | value1(q1) | date | value(q2)
-----------+------------+------------+--------
05-06-2010 | 10 | NULL | NULL
05-07-2017 | 12 | NULL | NULL
NULL | NULL | 05-07-2010 | 15
NULL | NULL | 01-02-2008 | 17
I tried wrapping everything in a CONCAT, but it doesn't work.
How can I get the result in this form:
date | value1(q1) | value(q2)
-----------+------------+-----------
05-06-2010 | 10 | 0
05-07-2017 | 12 | 10
07-08-2018 | 14 | 17
Try the script below:
SELECT [date],
SUM([value1(q1)]) AS 'value1(q1)',
SUM([value(q2)]) AS 'value(q2)'
FROM
(
SELECT [date],
[value1(q1)] AS 'value1(q1)',
0 AS 'value(q2)'
FROM your_table_1
UNION ALL
SELECT [date],
0 AS 'value1(q1)',
[value(q2)] AS 'value(q2)'
FROM your_table_2
)A
GROUP BY [date]
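As a rough, runnable illustration of the UNION ALL plus SUM idea, here is a sketch using Python's built-in sqlite3 module. The table names `q1` and `q2` and the sample rows are made up for the example (they stand in for the two subqueries), and SQLite is used only as a convenient stand-in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE q1 (date TEXT, value INTEGER);
    CREATE TABLE q2 (date TEXT, value INTEGER);
    INSERT INTO q1 VALUES ('2010-06-05', 10), ('2017-07-05', 12);
    INSERT INTO q2 VALUES ('2010-07-05', 15), ('2017-07-05', 3);
    """
)

# Each branch pads the other query's column with 0, so summing per date
# yields one row per date with both values side by side.
MERGE_SQL = """
SELECT date, SUM(v1) AS value1, SUM(v2) AS value2
FROM (
    SELECT date, value AS v1, 0 AS v2 FROM q1
    UNION ALL
    SELECT date, 0 AS v1, value AS v2 FROM q2
) AS u
GROUP BY date
ORDER BY date
"""

merged = conn.execute(MERGE_SQL).fetchall()
for row in merged:
    print(row)
```

A date that appears in only one of the two inputs ends up with 0 in the other column, which is the form the question asks for.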
I think you want a full join:
select coalesce(q1.date, q2.date) as date,
coalesce(q1.value, 0) as value1,
coalesce(q2.value, 0) as value2
from q1 full join
q2
on q1.date = q2.date;
I have a table with old values (some null) and new values for various attributes, all inserted at different add times throughout the months. I'm trying to update a second table with records with business month end dates. Right now, these records only contain the most recent new values for all month end dates. The goal is to create historical data by updating the previous month end values with the old values from the first table. I am a beginner and was able to come up with a query to update on one object where there was one entry from the first table. Now I am trying to expand the query to include multiple objects, with possible, multiple old values within the same month. I tried to use "order by" (since I need to make updates for a month in ascending order so it gets the latest value) but read that doesn't work with update statements without a subquery. So I tried my hand at making a more complicated query, without success. I am getting the following error: single-row subquery returns more than one row. Thanks!
TableA:
| ID | TYPE | OLD_VALUE | NEW_VALUE | ADD_TIME|
-----------------------------------------------
| 1 | A | 2 | 3 | 1/11/2019 8:00:00am |
| 1 | B | 3 | 4 | 12/10/2018 8:00:00am|
| 1 | B | 4 | 5 | 12/11/2018 8:00:00am|
| 2 | A | 5 | 1 | 12/5/2018 08:00:00am|
| 2 | A | 1 | 2 | 12/5/2019 09:00:00am|
| 2 | A | 2 | 3 | 12/5/2019 10:00:00am|
| 2 | B | 1 | 2 | 12/5/2019 10:00:00am|
TableB
| ID | MONTH_END | TYPE_A | TYPE_B |
-----------------------------------
| 1 | 1/31/19 | 3 | 5 |
| 1 | 12/31/18 | 3 | 5 |
| 1 | 11/30/18 | 3 | 5 |
| 2 | 12/31/18 | 3 | 2 |
| 2 | 11/30/18 | 3 | 2 |
Desired Output for TableB
| ID | MONTH_END | TYPE_A | TYPE_B |
-----------------------------------
| 1 | 1/31/19 | 3 | 5 |
| 1 | 12/31/18 | 2 | 5 |
| 1 | 11/30/18 | 2 | 3 |
| 2 | 12/31/18 | 3 | 2 |
| 2 | 11/30/18 | 5 | 2 |
My Query for Type A (Which I plan to adapt for Type B and execute as well for the desired output)
update TableB B
set b.type_a =
(
with aa as
(
select id, nvl(old_value, new_value) typea, add_time
from TableA
where type = 'A'
order by id, add_time ascending
)
select typea
from aa
where aa.id = b.id
and b.month_end <= aa.add_tm
)
where exists
(
with aa as
(
select id, nvl(old_value, new_value) typea, add_time
from TableA
where type = 'A'
order by id, add_time ascending
)
select typea
from aa
where aa.id = b.id
and b.month_end <= aa.add_tm
)
Kudos for giving example input data and desired output. I found your question a bit confusing, so let me rephrase it as: "Provide the last type A value from table A that is in the same month as the month end."
By matching on type and date of entry, we can get your answer. The "ROWNUM = 1" limits the result set to a single entry in case more than one row has the same add_time. This SQL is still a mess; maybe someone else can come up with a better one.
UPDATE tableb b
SET b.type_a =
    (SELECT a.old_value
     FROM tablea a
     WHERE a.id = b.id   -- also correlate on id so each object uses its own rows
       AND a.type = 'A'
       AND LAST_DAY( TRUNC( a.add_time ) ) = b.month_end
       AND a.add_time =
           (SELECT MAX( a2.add_time )
            FROM tablea a2
            WHERE a2.id = b.id
              AND a2.type = 'A'
              AND LAST_DAY( TRUNC( a2.add_time ) ) = b.month_end)
       AND ROWNUM = 1)
WHERE EXISTS
    (SELECT 1
     FROM tablea a
     WHERE a.id = b.id
       AND a.type = 'A'
       AND LAST_DAY( TRUNC( a.add_time ) ) = b.month_end);
I have a table that contains a lot of data, but the relevant data in the table looks something like this:
Orders table:
+----------+-----------+---------------+
| OrderID | Product | Date |
+----------+-----------+---------------+
| 1 | Apple | 01/01/2001 |
| 1 | Pear | 01/01/2001 |
| 1 | Pear | 01/01/2001 |
| 1 | Orange | 01/01/2001 |
| 1 | Pineapple | 01/01/2001 |
| 2 | Cherry | 02/02/2002 |
| 2 | Cherry | 02/02/2002 |
| 3 | Orange | 03/03/2003 |
| 3 | Apple | 03/03/2003 |
| 3 | Cherry | 03/03/2003 |
+----------+-----------+---------------+
I'd like a query to return a distinct list of orders, and if the order contains certain products, to indicate as such:
+----------+-----------+--------+-------+
| OrderID | Date | Apple? | Pear? |
+----------+-----------+--------+-------+
| 1 |01/01/2001 | X | X |
| 2 |02/02/2002 | | |
| 3 |03/03/2003 | X | |
+----------+-----------+--------+-------+
Here's where I've left off and decided to seek out help:
WITH CTEOrder AS
(
SELECT
OrderID, Product, Date,
ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY OrderID ASC) AS OrderRN
FROM
Orders
),
CTEApple as
(
SELECT
OrderID, Product, Date,
ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY OrderID ASC) AS AppleRN
FROM
Orders
WHERE
Product = 'Apple'
),
CTEPear AS
(
SELECT
OrderID, Product, Date,
ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY OrderID ASC) AS PearRN
FROM
Orders
WHERE
Product = 'Pear'
)
SELECT
o.OrderID, o.Product, o.Date,
co.OrderRN, a.AppleRN, p.PearRN
FROM
Orders AS o
OUTER JOIN
CTEOrder AS co ON o.OrderID = co.Orderid
OUTER JOIN
CTEApple AS a ON o.OrderID = a.OrderID
OUTER JOIN
CTEPear AS p ON o.OrderID = p.OrderID
WHERE
(co.OrderRN IS NULL AND a.AppleRN IS NULL AND p.PearRN IS NULL
OR co.OrderRN = 1 AND a.AppleRN IS NULL AND p.PearRN IS NULL
OR co.OrderRN = 1 AND a.AppleRN = 1 AND p.PearRN IS NULL
OR co.OrderRN = 1 AND a.AppleRN = 1 AND p.PearRN = 1
OR co.OrderRN = 1 AND a.AppleRN IS NULL AND p.PearRN = 1
OR co.OrderRN IS NULL AND a.AppleRN = 1 AND p.PearRN IS NULL
OR co.OrderRN IS NULL AND a.AppleRN = 1 AND p.PearRN = 1
OR co.OrderRN IS NULL AND a.AppleRN IS NULL AND p.PearRN = 1)
Currently my result set is unwieldy with a significant amount of duplication.
I'm thinking that I am heading in the wrong direction, but I don't know what other tools are available to me within SQL Server to cut up this data the way I need.
Thanks for any guidance!
Here's my result set after Nik Shenoy's guidance:
+----------+------------+--------+-------+
| OrderID  | Date       | Apple? | Pear? |
+----------+------------+--------+-------+
| 1        | 01/01/2001 | x      | NULL  |
| 1        | 01/01/2001 | NULL   | x     |
| 1        | 01/01/2001 | NULL   | x     |
| 1        | 01/01/2001 | NULL   | NULL  |
| 1        | 01/01/2001 | NULL   | NULL  |
| 2        | 02/02/2002 | NULL   | NULL  |
| 2        | 02/02/2002 | NULL   | NULL  |
| 3        | 03/03/2003 | NULL   | NULL  |
| 3        | 03/03/2003 | x      | NULL  |
| 3        | 03/03/2003 | NULL   | NULL  |
+----------+------------+--------+-------+
What is my next step to get only one row per order, like this?
+----------+-----------+--------+-------+
| OrderID | Date | Apple? | Pear? |
+----------+-----------+--------+-------+
| 1 |01/01/2001 | X | X |
| 2 |02/02/2002 | | |
| 3 |03/03/2003 | X | |
+----------+-----------+--------+-------+
You can just use conditional aggregation:
select o.orderid, date,
max(case when product = 'Apple' then 'X' end) as IsApple,
max(case when product = 'Pear' then 'X' end) as IsPear
from orders o
group by o.orderid, date;
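To see why conditional aggregation collapses each order to a single row, here is a small sketch using Python's built-in sqlite3 module with the sample Orders data from the question. SQLite is only a stand-in for SQL Server here, and the output column names `is_apple` and `is_pear` are my own choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, product TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "Apple", "01/01/2001"), (1, "Pear", "01/01/2001"),
        (1, "Pear", "01/01/2001"), (1, "Orange", "01/01/2001"),
        (1, "Pineapple", "01/01/2001"),
        (2, "Cherry", "02/02/2002"), (2, "Cherry", "02/02/2002"),
        (3, "Orange", "03/03/2003"), (3, "Apple", "03/03/2003"),
        (3, "Cherry", "03/03/2003"),
    ],
)

# The CASE yields 'X' only on matching rows and NULL elsewhere; MAX ignores
# NULLs, so each group gets 'X' if any of its rows matched, else NULL.
FLAGS_SQL = """
SELECT orderid, date,
       MAX(CASE WHEN product = 'Apple' THEN 'X' END) AS is_apple,
       MAX(CASE WHEN product = 'Pear'  THEN 'X' END) AS is_pear
FROM orders
GROUP BY orderid, date
ORDER BY orderid
"""

flags = conn.execute(FLAGS_SQL).fetchall()
for row in flags:
    print(row)
```

Orders without the product come back as NULL (None in Python); wrapping the MAX in COALESCE(..., '') would give blanks instead.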
If you know all the products in advance, you can use the Transact-SQL PIVOT relational operator to cross-tabulate the data by product. If you aggregate with MAX or COUNT, you can then turn any non-NULL or non-zero output into an 'X'.
SELECT
PivotData.OrderID
, PivotData.OrderDate
, CASE WHEN PivotData.Apple IS NULL THEN '' ELSE 'X' END AS [Apple?]
, CASE WHEN PivotData.Pear IS NULL THEN '' ELSE 'X' END AS [Pear?]
, CASE WHEN PivotData.Orange IS NULL THEN '' ELSE 'X' END AS [Orange?]
, CASE WHEN PivotData.Pineapple IS NULL THEN '' ELSE 'X' END AS [Pineapple?]
, CASE WHEN PivotData.Cherry IS NULL THEN '' ELSE 'X' END AS [Cherry?]
FROM
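-- the derived table needs a source; Product is duplicated so that one copy
-- can be aggregated while the other is spread into columns
(SELECT OrderID, [Date] AS OrderDate, Product, Product AS ProductName FROM Orders) AS [Order]
PIVOT (MAX(ProductName) FOR Product IN ( [Apple], [Pear], [Orange], [Pineapple], [Cherry] )) AS PivotData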
I have written a query that selects, let's say, 10 rows for this example.
+-----------+------------+
| STORENAME | COMPLAINTS |
+-----------+------------+
| Store1 | 4 |
| Store7 | 2 |
| Store8 | 1 |
| Store9 | 1 |
| Store2 | 1 |
| Store3 | 1 |
| Store4 | 1 |
| Store5 | 0 |
| Store6 | 0 |
| Store10 | 0 |
+-----------+------------+
How would I go about displaying the top 3 rows, with the remaining rows rolled up into a row called "Other" that adds all of their complaints together?
So like this for example:
+-----------+------------+
| STORENAME | COMPLAINTS |
+-----------+------------+
| Store1 | 4 |
| Store7 | 2 |
| Store8 | 1 |
| Other | 4 |
+-----------+------------+
So what happens above is that the top 3 rows are displayed, and the complaints of the remaining rows are added up into a row called "Other".
I have exhausted all my resources and cannot find a solution. Please let me know if this makes sense.
I have created a SQLfiddle of the above tables that you can edit if it is possible :)
Here's hoping this is possible :)
Thanks,
Mike
Something like this may work:
select *, row_number() over (order by complaints desc) as sno
into #temp
from
(
SELECT
a.StoreName
,COUNT(b.StoreID) AS [Complaints]
FROM Stores a
LEFT JOIN
(
SELECT
StoreName
,Complaint
,StoreID
FROM Complaints
WHERE Complaint = 'yes') b on b.StoreID = a.StoreID
GROUP BY a.StoreName
) as t ORDER BY [Complaints] DESC
select storename,complaints from #temp where sno<4
union all
select 'other',sum(complaints) as complaints from #temp where sno>=4
I do this with double aggregation and row_number():
select (case when seqnum <= 3 then storename else 'Other' end) as StoreName,
sum(numcomplaints) as numcomplaints
from (select c.storename, count(*) as numcomplaints,
row_number() over (order by count(*) desc) as seqnum
from complaints c
where c.complaint = 'Yes'
group by c.storename
) s
group by (case when seqnum <= 3 then storename else 'Other' end) ;
From what I can see, you don't really need any additional information from stores, so this version just leaves that table out.
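For anyone who wants to experiment with the double aggregation, here is a sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The sample rows are made up, the counting step is split into its own subquery to keep the SQLite syntax simple, and I added `storename` as a tie-breaker in the ORDER BY so that the result is deterministic; none of that comes from the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE complaints (storename TEXT, complaint TEXT)")
rows = (
    [("Store1", "Yes")] * 4
    + [("Store7", "Yes")] * 2
    + [("Store8", "Yes"), ("Store9", "Yes"), ("Store2", "Yes")]
    + [("Store5", "No")]  # filtered out by the WHERE clause
)
conn.executemany("INSERT INTO complaints VALUES (?, ?)", rows)

# Step 1: count complaints per store. Step 2: number the stores by count.
# Step 3: stores ranked after third share the label 'Other' and are summed.
TOP3_SQL = """
SELECT CASE WHEN seqnum <= 3 THEN storename ELSE 'Other' END AS storename,
       SUM(numcomplaints) AS numcomplaints
FROM (
    SELECT storename, numcomplaints,
           ROW_NUMBER() OVER (ORDER BY numcomplaints DESC, storename) AS seqnum
    FROM (
        SELECT storename, COUNT(*) AS numcomplaints
        FROM complaints
        WHERE complaint = 'Yes'
        GROUP BY storename
    ) AS c
) AS s
GROUP BY CASE WHEN seqnum <= 3 THEN storename ELSE 'Other' END
ORDER BY MIN(seqnum)
"""

top3 = conn.execute(TOP3_SQL).fetchall()
for row in top3:
    print(row)
```

Ordering by MIN(seqnum) keeps the top three in rank order and puts "Other" last, since every store folded into it has a rank of 4 or more.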
I have a table like this that stores the results of medical checkups, along with the date each report was sent. The date sent is based on the clinic visit date. A client can have one or more reports (the dates may vary).
---------------------------------------
| client_id | date_sent | result |
---------------------------------------
| 1 | 2001 | A |
| 1 | 2002 | B |
| 2 | 2002 | D |
| 3 | 2001 | A |
| 3 | 2003 | C |
| 3 | 2005 | E |
| 4 | 2002 | D |
| 4 | 2004 | E |
| 5 | 2004 | B |
---------------------------------------
I want to extract the following report from the above data.
---------------------------------------------------
| client_id | result1 | result2 | result3 |
---------------------------------------------------
| 1 | A | B | |
| 2 | D | | |
| 3 | A | C | E |
| 4 | D | E | |
| 5 | B | | |
---------------------------------------------------
I'm working with PostgreSQL. The "crosstab" function won't work here because "date_sent" is not consistent across clients.
Can anyone please give me a rough idea of how this should be queried?
I suggest the following approach:
SELECT client_id, array_agg(result ORDER BY date_sent) AS results
FROM labresults
GROUP BY client_id;
It's not exactly the same output format, but it will give you the same information much faster and cleaner.
If you want the results in separate columns, you can always do this:
SELECT client_id,
results[1] AS result1,
results[2] AS result2,
results[3] AS result3
FROM
(
SELECT client_id, array_agg(result ORDER BY date_sent) AS results
FROM labresults
GROUP BY client_id
) AS r
ORDER BY client_id;
although that will obviously introduce a hardcoded number of possible results.
While reading about "simulating row_number", I tried another way to do this:
SELECT client_id,
MAX( CASE seq WHEN 1 THEN result ELSE '' END ) AS result1,
MAX( CASE seq WHEN 2 THEN result ELSE '' END ) AS result2,
MAX( CASE seq WHEN 3 THEN result ELSE '' END ) AS result3,
MAX( CASE seq WHEN 4 THEN result ELSE '' END ) AS result4,
MAX( CASE seq WHEN 5 THEN result ELSE '' END ) AS result5
FROM ( SELECT p1.client_id,
p1.result,
( SELECT COUNT(*)
FROM labresults p2
WHERE p2.client_id = p1.client_id
AND p2.result <= p1.result )
FROM labresults p1
) D ( client_id, result, seq )
GROUP BY client_id;
However, the query took about 10 minutes (more than 500,000 ms) for 30,000 records, which is far too long.
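The correlated COUNT(*) above is re-evaluated for every row, which probably explains the run time. One possible alternative is to number the rows with a real ROW_NUMBER() window function and then pivot with conditional aggregation. The sketch below uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for PostgreSQL, numbers the results by date_sent instead of by result value, and hardcodes three result columns; all of these are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE labresults (client_id INTEGER, date_sent INTEGER, result TEXT)"
)
conn.executemany(
    "INSERT INTO labresults VALUES (?, ?, ?)",
    [
        (1, 2001, "A"), (1, 2002, "B"),
        (2, 2002, "D"),
        (3, 2001, "A"), (3, 2003, "C"), (3, 2005, "E"),
        (4, 2002, "D"), (4, 2004, "E"),
        (5, 2004, "B"),
    ],
)

# Number each client's reports by date, then pick the n-th report
# into the n-th output column.
PIVOT_SQL = """
SELECT client_id,
       MAX(CASE WHEN seq = 1 THEN result END) AS result1,
       MAX(CASE WHEN seq = 2 THEN result END) AS result2,
       MAX(CASE WHEN seq = 3 THEN result END) AS result3
FROM (
    SELECT client_id, result,
           ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY date_sent) AS seq
    FROM labresults
) AS d
GROUP BY client_id
ORDER BY client_id
"""

report = conn.execute(PIVOT_SQL).fetchall()
for row in report:
    print(row)
```

Missing results come back as NULL rather than empty strings; the inner query is a single pass over the table instead of one subquery per row.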