If I have following table in Postgres:
order_dtls
Order_id Order_date Customer_name
-------------------------------------
1 11/09/17 Xyz
2 15/09/17 Lmn
3 12/09/17 Xyz
4 18/09/17 Abc
5 15/09/17 Xyz
6 25/09/17 Lmn
7 19/09/17 Abc
I want to retrieve such customer who has placed orders on 2 consecutive days.
In above case Xyz and Abc customers should be returned by query as result.
There are many ways to do this. Use an EXISTS semi-join followed by DISTINCT or GROUP BY, should be among the fastest.
Postgres syntax:
SELECT DISTINCT customer_name
FROM order_dtls o
WHERE EXISTS (
SELEST 1 FROM order_dtls
WHERE customer_name = o.customer_name
AND order_date = o.order_date + 1 -- simple syntax for data type "date" in Postgres!
);
If the table is big, be sure to have an index on (customer_name, order_date) to make it fast - index items in this order.
To clarify, since Oto happened to post almost the same solution a bit faster:
DISTINCT is an SQL construct, a syntax element, not a function. Do not use parentheses like DISTINCT (customer_name). Would be short for DISTINCT ROW(customer_name) - a row constructor unrelated to DISTINCT - and just noise for the simple case with a single expression, because Postgres removes the pointless row wrapper for a single element automatically. But if you wrap more than one expression like that, you get an actual row type - an anonymous record actually, since no row type is given. Most certainly not what you want.
What is a row constructor used for?
Also, don't confuse DISTINCT with DISTINCT ON (expr, ...). See:
Select first row in each GROUP BY group?
Try something like...
SELECT `order_dtls`.*
FROM `order_dtls`
INNER JOIN `order_dtls` AS mirror
ON `order_dtls`.`Order_id` <> `mirror`.`Order_id`
AND `order_dtls`.`Customer_name` = `mirror`.`Customer_name`
AND DATEDIFF(`order_dtls`.`Order_date`, `mirror`.`Order_date`) = 1
The way I would think of it doing it would be to join the table the date part with itselft on the next date and joining it with the Customer_name too.
This way you can ensure that the same customer_name done an order on 2 consecutive days.
For MySQL:
SELECT distinct *
FROM order_dtls t1
INNER JOIN order_dtls t2 on
t1.Order_date = DATE_ADD(t2.Order_date, INTERVAL 1 DAY) and
t1.Customer_name = t2.Customer_name
The result you should also select it with the Distinct keyword to ensure the same customer is not displayed more than 1 time.
For postgresql:
select distinct(Customer_name) from your_table
where exists
(select 1 from your_table t1
where
Customer_name = your_table.Customer_name and Order_date = your_table.Order_date+1 )
Same for MySQL, just instead of your_table.Order_date+1 use: DATE_ADD(your_table.Order_date , INTERVAL 1 DAY)
This should work:
SELECT A.customer_name
FROM order_dtls A
INNER JOIN (SELECT customer_name, order_date FROM order_dtls) as B
ON(A.customer_name = B.customer_name and Datediff(B.Order_date, A.Order_date) =1)
group by A.customer_name
I have the following tables in SQL Server:
COMMANDLINES: ID_LINE - ID_COMMAND - ID_ARTICLE - QUANTITY
COMMAND: ID_COMMAND - ID_CLIENT - PRICE - PRINTED
CLIENT: ID_CLIENT - FULL_NAME - SSN - PH_NUM - MOBILE - USERNAME - PASSWORD
ARTICLE: ID_ARTICLE - DES - NAME - PRICE - TYPE - CURRENT_QTT - MINIMUM_QTT
ID_COMMAND from COMMANDLINES references COMMAND.ID_COMMAND
ID_CLIENT from COMMAND references CLIENT.ID_CLIENT
ID_ARTICLE from COMMANDLINES references ARTICLE.ID_ARTICLE
I need to create a view where I need to show all COMMANDLINES that have the best client (the one with the highest total of PRICE) and then I need to order them by ID_COMMAND in a descending order AND by ID_LINE in ascending order.
Sample data:
COMMANDLINE table:
COMMAND table:
Only these 2 are needed to resolve the problem. I added the other just for more information.
Sample output:
To be honest, I'm not sure if both outputs are supposed to be "output" at the same time or that I need 2 VIEWS for each output.
WHAT HAVE I DONE SO FAR:
I looked through what I could find on StackOverflow about MAX of SUM, but unfortunately, it has not helped me much in this case. I always seem to be doing something wrong.
I also found out that in order to use ORDER BY in VIEWS you need to, in this case, use TOP, but I've no idea how to apply it correctly when I need to select all of the COMMANDLINES. In one of my previous things, I used the following SELECT TOP:
create view PRODUCTS_BY_TYPE
as
select top (select count(*) from ARTICLE
where CURRENT_QTT > MINIMUM_QTT)*
from
ARTICLE
order by
TYPE
This allowed me to show all PRODUCT data where the CURRENT_QTT was more than the minimum ordering them by type, but I can't figure out for the life of me, how to apply this to my current situation.
I could start with something like this:
create view THE_BEST
as
select COMMANDLINE.*
from COMMANDLINE
But then I don't know how to apply the TOP.
I figured that first, I need to see who the best client is, by SUM-ing all of the PRICE under his ID and then doing a MAX on all of the SUM of all clients.
So far, the best I could come up with is this:
create view THE_BEST
as
select top (select count(*)
from (select max(max_price)
from (select sum(PRICE) as max_price
from COMMAND) COMMAND) COMMAND) COMMANDLINE.*
from COMMANDLINE
inner join COMMAND on COMMANDLINE.ID_COMMAND = COMMAND.ID_COMMAND
order by COMMAND.ID_COMMAND desc, COMMANDLINE.ID_LINE asc
Unfortunately, in the select count(*) the COMMAND is underlined in red (a.k.a. the 3rd COMMAND word) and it says that there is "no column specified for column 1 of COMMAND".
EDIT:
I've come up with something closer to what I want:
create view THE_BEST
as
select top (select count(*)
from (select max(total_price) as MaxPrice
from (select sum(PRICE) as total_price
from COMMAND) COMMAND) COMMAND)*
from COMMANDLINE
order by ID_LINE asc
Still missing the ordered by ID_COMMAND and I only get 1 result in the output when it should be 2.
here is some code that hopefully will show you how you can use the top-clause and also a different approche to show only the "top" :-)
/* Creating Tables*/
CREATE TABLE ARTICLE (ID_ARTICLE int,DES varchar(10),NAME varchar(10),PRICE float,TYPE int,CURRENT_QTT int,MINIMUM_QTT int)
CREATE TABLE COMMANDLINES (ID_LINE int,ID_COMMAND int,ID_ARTICLE int,QUANTITY int)
CREATE TABLE COMMAND (ID_COMMAND int, ID_CLIENT varchar(20), PRICE float, PRINTED int)
CREATE TABLE CLIENT (ID_CLIENT varchar(20), FULL_NAME varchar(50), SSN varchar(50), PH_NUM varchar(50), MOBILE varchar(50), USERNAME varchar(50), PASSWORD varchar(50))
INSERT INTO COMMANDLINES VALUES (1,1,10,20),(2,1,12,3),(3,1,2,21),(1,2,30,2),(2,2,21,5),(1,3,32,20),(2,3,21,2)
INSERT INTO COMMAND VALUES (1,'1695152D',1200,0),(2,'1695152D',500,0),(3,'2658492D',200,0)
INSERT INTO ARTICLE VALUES(1, 'A','AA',1300,0,10,5),(2,'B','BB',450,0,10,5),(30,'C','CC',1000,0,5,5),(21,'D','DD',1500,0,5,5),(32,'E','EE',1600,1,4,5),(3,'F','FF',210,2,15,5)
INSERT INTO CLIENT VALUES ('1695152D', 'DoombringerBG', 'A','123','321','asdf','asf'),('2658492D', 'tgr', 'A','123','321','asdf','asf')
GO
/* Your View-Problem*/
CREATE VIEW PRODUCTS_BY_TYPE AS
SELECT TOP 100 PERCENT *
FROM ARTICLE
WHERE CURRENT_QTT > MINIMUM_QTT -- You really don't want >= ??
ORDER BY [Type]
-- why do you need your view with an ordered output? cant your query order the data?
GO
OUTPUT:
ID_ARTICLE | DES | NAME | PRICE | TYPE | CURRENT_QTT | MINIMUM_QTT
-------------+-------+-------+-------+------+--------------+-------------
1 | A | AA | 1300 | 0 | 10 | 5
2 | B | BB | 450 | 0 | 10 | 5
3 | F | FF | 210 | 2 | 15 | 5
I hope this is what you were looking for :-)
-- your top customers
SELECT cli.FULL_NAME, SUM(c.PRICE)
FROM COMMANDLINES as cl
INNER JOIN COMMAND as c
on cl.ID_COMMAND = c.ID_COMMAND
INNER JOIN CLIENT as cli
on cli.ID_CLIENT = c.ID_CLIENT
GROUP BY cli.FULL_NAME
ORDER BY SUM(c.PRICE) DESC -- highest value first
SELECT *
FROM (
-- your top customers with a rank
SELECT cli.FULL_NAME, SUM(c.PRICE) as Price, ROW_NUMBER() OVER (ORDER BY SUM(c.PRICE) DESC) AS RowN
FROM COMMANDLINES as cl
INNER JOIN COMMAND as c
on cl.ID_COMMAND = c.ID_COMMAND
INNER JOIN CLIENT as cli
on cli.ID_CLIENT = c.ID_CLIENT
GROUP BY cli.FULL_NAME
) as a
-- only the best :-)
where RowN = 1
--considerations: what if two customers have the same value?
Output:
FULL_NAME |Price | RowN
----------------+---------+-------
DoombringerBG | 4600 | 1
Regards
tgr
===== EDITED =====
The syntax-corrention to your THE_BEST-View:
create view THE_BEST AS
SELECT TOP (
SELECT count(*) as cnt
FROM (
SELECT max(max_price) as max_price
FROM (
SELECT sum(PRICE) AS max_price
FROM COMMAND
) COMMAND
) COMMAND
)
cl.*
FROM COMMANDLINES as cl
INNER JOIN COMMAND as c
ON cl.ID_COMMAND = c.ID_COMMAND
ORDER BY c.ID_COMMAND DESC
,cl.ID_LINE ASC
Without the OVER-Clause:
SELECT TOP 1 *
FROM (
-- your top customers with a rank
SELECT cli.FULL_NAME, SUM(c.PRICE) as Price
FROM COMMANDLINES as cl
INNER JOIN COMMAND as c
on cl.ID_COMMAND = c.ID_COMMAND
INNER JOIN CLIENT as cli
on cli.ID_CLIENT = c.ID_CLIENT
GROUP BY cli.FULL_NAME
) as a
-- only the best :-)
ORDER BY Price DESC
Your PRODUCTS_BY_TYPE without PERCENT:
CREATE VIEW PRODUCTS_BY_TYPE AS
SELECT TOP (select
SUM(p.rows)
from sys.partitions as p
inner join sys.all_objects as ao
on p.object_id = ao.object_id
where ao.name = 'ARTICLE'
and ao.type = 'U')
*
FROM ARTICLE
WHERE CURRENT_QTT > MINIMUM_QTT -- You really don't want >= ??
ORDER BY [Type]
go
but to be honest - i would never use such a query in production... i only posted this because you need it for studing purposes...
It is quite likely that there is some misunderstanding between you and your teacher. You can technically have ORDER BY clause in a view definition, but it never guarantees any order of the rows in the query that uses the view, such as SELECT ... FROM your_view. Without ORDER BY in the final SELECT the order of the result set is not defined. The order of rows returned to the client by the server is determined only by the final outermost ORDER BY of the query, not by the ORDER BY in the view definition.
The purpose of having TOP in the view definition is to limit the number of returned rows somehow. For example, TOP (1). In this case ORDER BY specifies which row(s) to return.
Having TOP 100 PERCENT in a view does nothing. It doesn't reduce the number of returned rows and it doesn't guarantee any specific order of returned rows.
Having said all that, in your case you need to find one best client, so it makes sense to use TOP (1) in a sub-query.
This query would return the ID of the best client:
SELECT
TOP (1)
-- WITH TIES
ID_CLIENT
FROM COMMAND
GROUP BY ID_CLIENT
ORDER BY SUM(PRICE) DESC
If there can be several clients with the same maximum total price and you want to return data related to all of them, not just one random client, then use TOP WITH TIES.
Finally, you need to return lines that correspond to the chosen client(s):
create view THE_BEST
as
SELECT
COMMANDLINE.ID_LINE
,COMMANDLINE.ID_COMMAND
,COMMANDLINE.ID_ARTICLE
,COMMANDLINE.QUANTITY
FROM
COMMANDLINE
INNER JOIN COMMAND ON COMMAND.ID_COMMAND = COMMANDLINE.ID_COMMAND
WHERE
COMMAND.ID_CLIENT IN
(
SELECT
TOP (1)
-- WITH TIES
ID_CLIENT
FROM COMMAND
GROUP BY ID_CLIENT
ORDER BY SUM(PRICE) DESC
)
;
This is how the view can be used:
SELECT
ID_LINE
,ID_COMMAND
,ID_ARTICLE
,QUANTITY
FROM THE_BEST
ORDER BY ID_COMMAND DESC, ID_LINE ASC;
Note, that ORDER BY ID_COMMAND DESC, ID_LINE ASC has to be in the actual query, not in the view definition.
I am currently refreshing my knowledge in SQL and encountered some difficulties with the following query.
The requirements were:
For each maker, list in the alphabetical order with "/" as delimiter all the types of products he produces.
Deduce: maker, product types' list
The following solution actually works but I don't exactly understand how..
;with
t1 as
(select maker, type, DENSE_RANK() over(partition by maker order by type) rn
from product
),
tr(maker, type,lev) as
(select distinct t1.maker, cast(t1.type as nvarchar) , 2 from t1 where t1.rn = 1
union all
select t1.maker, cast(tr.type +'/'+t1.type as nvarchar), lev + 1
from t1 join tr on (t1.maker = tr.maker and t1.rn = tr.lev
)
)
select maker, max(type) names from tr group by maker
These output:
1 | A | Laptop/PC/Printer
2 | B | Laptop/PC
3 | C | Laptop
4 | D | Printer
5 | E | PC/Printer
*second column is the maker and third is the dynamically concatenated list of types.
Now, I'm a bit confused on how exactly does the lev grows dynamically.. Is there some kind of loop I'm missing here?
Why does it start with 2?
Why it doesn't work without the "cast"?
I would be very grateful if someone could explain the logic behind this query.
Thanks a lot!
What you're looking at is a recursive CTE,a CTE that calls itself. It's what you think is the "looping" effect. Recursion put a kink in my brain when I first looked at it. It helps to look at some examples and try creating a few simple ones of your own. I'm still not the best at them, but I'm getting better. I added some comments to your code. Hope this helps.
;WITH t1
AS (
SELECT maker,
type,
--Partition says each maker is a group so restart at 1
--Order by type is alphabetic
--DENSE_RANK() means if there are two the of the same item, they get the same number
DENSE_RANK() OVER (PARTITION BY maker ORDER BY type) rn
FROM product
),
--This is a recursive CTE meaning a CTE that calls itself(what is doing the "looping"
tr (maker,type,lev)
AS (
--Grab only the distinct items from your ranked table
SELECT DISTINCT t1.maker,
cast(t1.type AS NVARCHAR),
2 --This is the start of your recursive loop
FROM t1
WHERE t1.rn = 1
UNION ALL
--Recursively loop through you data, adding each string at the end with the '/'
SELECT t1.maker,
cast(tr.type + '/' + t1.type AS NVARCHAR),
--Plus one to grab the next value
lev + 1
FROM t1
INNER JOIN tr ON (
--Only match the same makers
t1.maker = tr.maker
--Match to the next value
AND t1.rn = tr.lev
)
)
--I think you know what this does
SELECT maker,
max(type) names
FROM tr
GROUP BY maker
Some Context: The DB is Oracle. I am trying to create one delimited string per row of a table. The delimited string needs to contain two of the item's categories (if available, there will always be one category at a minimum). There are 3 tables, ITEM, ITEM_CAT and ITEM_ITEM_CAT. The ITEM table holds all the items. The ITEM_CAT table holds all the possible item categories. The ITEM_ITEM_CAT holds all the mappings between the item IDs and the category IDs, a key value table essentially.
I have created the below SQL which is able to get the delimited string for a specific item, but I need a query which can run against the entire table.
SELECT 'ITEM'||'%#'|| outerTable.ITEM_ID ||'%#'||
(SELECT midTable.item_cat_nam
FROM
(SELECT innerTable.item_cat_nam AS item_cat_nam, innerTable.item_id AS item_id, ROWNUM AS rn
FROM
(SELECT ic.ITEM_CAT_NAM AS item_cat_nam, i.ITEM_ID AS item_id
FROM ITEM_CAT ic, ITEM_ITEM_CAT iic, ITEM i
WHERE i.ITEM_ID = iic.ITEM_ID
AND iic.ITEM_CAT_CD = ic.ITEM_CAT_CD
AND 287484 = i.item_id
) innerTable
) midTable
WHERE rn = 1
) ||'%#'||
(SELECT midTable.item_cat_nam
FROM
(SELECT innerTable.item_cat_nam AS item_cat_nam, innerTable.item_id AS item_id, ROWNUM AS rn
FROM
(SELECT ic.ITEM_CAT_NAM AS item_cat_nam, i.ITEM_ID AS item_id
FROM ITEM_CAT ic, ITEM_ITEM_CAT iic, ITEM i
WHERE i.ITEM_ID = iic.ITEM_ID
AND iic.ITEM_CAT_CD = ic.ITEM_CAT_CD
AND 287484 = i.item_id
) innerTable
) midTable
WHERE rn = 2
)
FROM OFR outerTable
WHERE outerTable.ITEM_ID = 287484;
I need to be able to pass the outer table's ITEM_ID down into the last inner join. I could do this when I only need the category (via the below SQL statement, only one inner join needed), but with the introduction of multiple categories; I need rownum (to get multiple categories) which then needs more inner joins and I can't seem to pass the ITEM_ID down more than one inner join, and here lies the problem...
SELECT 'ITEM'||'%#'|| outerTable.OFR_ID ||'%#'||
(SELECT ic.ITEM_CAT_NAM
FROM ITEM_CAT ic, ITEM_ITEM_CAT iic, ITEM i
WHERE i.ITEM_ID = iic.ITEM_ID
AND iic.ITEM_CAT_CD = ic.ITEM_CAT_CD
AND outerTable.OFR_ID = i.item_id
AND rownum = 1
) innerTable
FROM OFR outerTable;
Can anyone help with this?
Thank you in advance for any assitance.
No worries. You need something like this...
SELECT 'ITEM' || '%#' || Item_ID || '%#' || CatName1 || '%#' || CatName2
FROM outerTable
INNER JOIN (
SELECT
Item_ID,
MAX(CASE WHEN rn = 1 THEN Item_Cat_Nam ELSE NULL END) CatName1,
MAX(CASE WHEN rn = 2 THEN Item_Cat_Nam ELSE NULL END) CatName2
FROM (
SELECT
Item_ID,
Item_Cat.Item_Cat_Nam,
ROW_NUMBER() OVER (PARTITION BY Item_ID ORDER BY Item_ID) rn
FROM Item
INNER JOIN Item_Item_Cat USING (Item_ID)
INNER JOIN Item_Cat USING (Item_Cat_Cd)
) GROUP BY Item_ID
) USING (Item_ID)
The innermost query uses the ROW_NUMBER function to assign 1, 2, 3, etc. to every category found for each item. The PARTITION BY restarts the numbering at 1 for each item. The ORDER BY is required so I used Item_ID because hey, why not? If you have a preferred column to order by just use that - it will be used to assign the row numbers. The inner query will output something like this:
Item_ID Item_Cat_Nam rn
------- ------------ --
1 Category aa 1
1 Category xy 2
1 Category ef 3
2 Category xy 1
2 Category ax 2
3 Category ef 1
The query surrounding the innermost query uses MAX to flatten the first two rn values for each Item_ID into a single row. The Item_Cat_Nam for rn=1 goes to the CatName1 column and the Item_Cat_Nam for rn=2 goes to the CatName2 column. When it's fed the results shown above you'll end up with this:
Item_ID CatName1 CatName2
------- ----------- -----------
1 Category aa Category xy (note Category ef is rn=3 so it's ignored)
2 Category xy Category ax
3 Category ef (note only one row for Item_ID 3)
Then the very outer query just concatenates everything.
One other thing: I used the "JOIN ... USING" syntax because in this case it lets you eliminate all of the aliases (innerTable, i, ic, iic, midTable, etc.). That's purely because I'm more comfortable with it so it helped me figure this out a lot quicker. You should feel free to use your own join style - after all you'll be the one stuck maintaining it :)
Imagine the following schema and sample data (SQL Server 2008):
OriginatingObject
----------------------------------------------
ID
1
2
3
ValueSet
----------------------------------------------
ID OriginatingObjectID DateStamp
1 1 2009-05-21 10:41:43
2 1 2009-05-22 12:11:51
3 1 2009-05-22 12:13:25
4 2 2009-05-21 10:42:40
5 2 2009-05-20 02:21:34
6 1 2009-05-21 23:41:43
7 3 2009-05-26 14:56:01
Value
----------------------------------------------
ID ValueSetID Value
1 1 28
etc (a set of rows for each related ValueSet)
I need to obtain the ID of the most recent ValueSet record for each OriginatingObject. Do not assume that the higher the ID of a record, the more recent it is.
I am not sure how to use GROUP BY properly in order to make sure the set of results grouped together to form each aggregate row includes the ID of the row with the highest DateStamp value for that grouping. Do I need to use a subquery or is there a better way?
You can do it with a correlated subquery or using IN with multiple columns and a GROUP-BY.
Please note, simple GROUP-BY can only bring you to the list of OriginatingIDs and Timestamps. In order to pull the relevant ValueSet IDs, the cleanest solution is use a subquery.
Multiple-column IN with GROUP-BY (probably faster):
SELECT O.ID, V.ID
FROM Originating AS O, ValueSet AS V
WHERE O.ID = V.OriginatingID
AND
(V.OriginatingID, V.DateStamp) IN
(
SELECT OriginatingID, Max(DateStamp)
FROM ValueSet
GROUP BY OriginatingID
)
Correlated Subquery:
SELECT O.ID, V.ID
FROM Originating AS O, ValueSet AS V
WHERE O.ID = V.OriginatingID
AND
V.DateStamp =
(
SELECT Max(DateStamp)
FROM ValueSet V2
WHERE V2.OriginatingID = O.ID
)
SELECT OriginatingObjectID, id
FROM (
SELECT id, OriginatingObjectID, RANK() OVER(PARTITION BY OriginatingObjectID
ORDER BY DateStamp DESC) as ranking
FROM ValueSet)
WHERE ranking = 1;
This can be done with a correlated sub-query. No GROUP-BY necessary.
SELECT
vs.ID,
vs.OriginatingObjectID,
vs.DateStamp,
v.Value
FROM
ValueSet vs
INNER JOIN Value v ON v.ValueSetID = vs.ID
WHERE
NOT EXISTS (
SELECT 1
FROM ValueSet
WHERE OriginatingObjectID = vs.OriginatingObjectID
AND DateStamp > vs.DateStamp
)
This works only if there can not be two equal DateStamps for a OriginatingObjectID in the ValueSet table.