Order rows by how they appear in the table - sql

I have 2 tables that I'm joining:
featured_products
item_id
999
234
700
products
item_id price
234 10.00
700 15.50
999 5.30
I need to join both tables but keep the order in which they appear in the featured_products table.
SELECT *
FROM products p
INNER JOIN featured_products f ON p.item_id = f.item_id
Expected results:
item_id price
----------------
999 5.30
234 10.00
700 15.50
Is there a way to do this?

I get the comments. If order is important, you must add an ORDER BY clause. If there is no column that defines the order you want, it gets tedious. An order by item_id or price would be easy, but you have something custom. I don't understand the logic.
If you need to define the order manually, adding a column of values that defines the order, and using it in an ORDER BY clause, might be the way to go.
To force this order for just these 3 records, a case statement can be used.
ORDER BY CASE p.item_id WHEN 999 THEN 1 WHEN 234 THEN 2 ELSE 3 END
If you want another order, you will have to do a little more work. (We can help if you share the requirements.)
Any order you happen to see in results without an ORDER BY clause is most likely there because the index used to read the data happened to be in that order. If the query goes parallel, some segments of the result may be in order while the result as a whole is not.
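The suggestion above of adding a column that defines the order can be sketched as follows. This is only an illustration run against an in-memory SQLite database from Python; the sort_order column is a hypothetical addition to featured_products and is not part of the original tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (item_id INTEGER PRIMARY KEY, price REAL);
    -- sort_order is a hypothetical column that records the desired position.
    CREATE TABLE featured_products (item_id INTEGER, sort_order INTEGER);
    INSERT INTO products VALUES (234, 10.00), (700, 15.50), (999, 5.30);
    INSERT INTO featured_products VALUES (999, 1), (234, 2), (700, 3);
""")

# The explicit ORDER BY is what guarantees the order of the result.
rows = conn.execute("""
    SELECT p.item_id, p.price
    FROM products p
    INNER JOIN featured_products f ON p.item_id = f.item_id
    ORDER BY f.sort_order
""").fetchall()

for item_id, price in rows:
    print(item_id, price)
```

With the ordering column in place, changing the featured order only means updating sort_order, not rewriting the query.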


How do you retrieve the top two records within each grouping

In my table, I have data that looks like this:
CODE DATE PRICE
100 1/1/13 $500
100 2/1/13 $521
100 3/3/13 $530
100 5/9/13 $542
222 3/3/13 $20
350 1/1/13 $200
350 3/1/13 $225
Is it possible to create a query that pulls out the TWO most recent records by DATE, but only for codes that have two or more dates? The result would be:
CODE DATE PRICE
100 5/9/13 $542
100 3/3/13 $530
350 3/1/13 $225
350 1/1/13 $200
Bonus points if you can put both prices/dates on the same line, like this:
CODE OLD_DATE OLD_PRICE NEW_DATE NEW_PRICE
100 3/3/13 $530 5/9/13 $542
350 1/1/13 $200 3/1/13 $225
Thank you!!!
I managed to solve it with 5 sub-queries and 1 rollup query.
First we have a subquery that gives us the MAX date for each code.
Next, we do the same subquery, except we exclude our previous results.
We assume that your data is already rolled up and you won't have duplicate dates for the same code.
Next we bring in the appropriate Code / Price for the latest and 2nd latest date. If a code doesn't exist in the 2nd Max query - then we don't include it at all.
In the union query we're combining the results of both. In the Rollup Query, we're sorting and removing null values generated in the union.
Results:
CODE MaxOfOLDDATE MaxOfOLDPRICE MaxOfNEWDATE MaxOfNEWPRICE
100 2013-03-03 $530.00 2013-05-09 542
350 2013-01-01 $200.00 2013-03-01 225
Using your Data in a table called "Table", create the following queries:
SUB_2ndMaxDatesPerCode:
SELECT Table.CODE, Max(Table.Date) AS MaxOfDATE1
FROM SUB_MaxDatesPerCode RIGHT JOIN [Table] ON (SUB_MaxDatesPerCode.MaxOfDATE = Table.DATE) AND (SUB_MaxDatesPerCode.CODE = Table.CODE)
GROUP BY Table.CODE, SUB_MaxDatesPerCode.CODE
HAVING (((SUB_MaxDatesPerCode.CODE) Is Null));
SUB_MaxDatesPerCode:
SELECT Table.CODE, Max(Table.Date) AS MaxOfDATE
FROM [Table]
GROUP BY Table.CODE;
SUB_2ndMaxData:
SELECT Table.CODE, Table.Date, Table.PRICE
FROM [Table] INNER JOIN SUB_2ndMaxDatesPerCode ON (Table.DATE = SUB_2ndMaxDatesPerCode.MaxOfDATE1) AND (Table.CODE = SUB_2ndMaxDatesPerCode.CODE);
SUB_MaxData:
SELECT Table.CODE, Table.Date, Table.PRICE
FROM ([Table] INNER JOIN SUB_MaxDatesPerCode ON (Table.DATE = SUB_MaxDatesPerCode.MaxOfDATE) AND (Table.CODE = SUB_MaxDatesPerCode.CODE)) INNER JOIN SUB_2ndMaxDatesPerCode ON Table.CODE = SUB_2ndMaxDatesPerCode.CODE;
SUB_Data:
SELECT CODE, DATE AS OLDDATE, PRICE AS OLDPRICE, NULL AS NEWDATE, NULL AS NEWPRICE FROM SUB_2ndMaxData
UNION ALL SELECT CODE, NULL AS OLDDATE, NULL AS OLDPRICE, DATE AS NEWDATE, PRICE AS NEWPRICE FROM SUB_MaxData;
Data (Rollup):
SELECT SUB_Data.CODE, Max(SUB_Data.OLDDATE) AS MaxOfOLDDATE, Max(SUB_Data.OLDPRICE) AS MaxOfOLDPRICE, Max(SUB_Data.NEWDATE) AS MaxOfNEWDATE, Max(SUB_Data.NEWPRICE) AS MaxOfNEWPRICE
FROM SUB_Data
GROUP BY SUB_Data.CODE
ORDER BY SUB_Data.CODE;
There you go - thanks for the challenge.
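For comparison, here is a more compact sketch of the same idea (latest date plus the latest date before it, per code). It is not the Access solution above: it uses a self-join with correlated subqueries and runs against an in-memory SQLite database from Python. The table name prices, the column name dt, and the ISO date strings are assumptions made for this illustration. Like the answer above, it assumes there are no duplicate dates for the same code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (code INTEGER, dt TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        (100, "2013-01-01", 500),
        (100, "2013-02-01", 521),
        (100, "2013-03-03", 530),
        (100, "2013-05-09", 542),
        (222, "2013-03-03", 20),
        (350, "2013-01-01", 200),
        (350, "2013-03-01", 225),
    ],
)

# n is the newest row per code; o is the newest row strictly older than n.
# Codes with a single date have no such o row, so they drop out of the join.
rows = conn.execute("""
    SELECT n.code, o.dt AS old_date, o.price AS old_price,
           n.dt AS new_date, n.price AS new_price
    FROM prices AS n
    INNER JOIN prices AS o ON o.code = n.code
    WHERE n.dt = (SELECT MAX(p.dt) FROM prices AS p WHERE p.code = n.code)
      AND o.dt = (SELECT MAX(p.dt) FROM prices AS p
                  WHERE p.code = n.code AND p.dt < n.dt)
    ORDER BY n.code
""").fetchall()

for row in rows:
    print(row)
```

Code 222 is absent from the result because it has only one date, which matches the "only if there are 2+ dates" requirement.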
Accessing the recent data
To get the most recent rows, sort the data in descending order and take the TOP 2. It is like writing the alphabet backwards, starting ZYX, and selecting the TOP 2, which gives you Z and Y.
SELECT TOP 2 * FROM table_name ORDER BY column_time DESC;
This reverses the order of the table and then selects the two most recent rows from the top.
Joining the Tables
To join two tables on a matching column and build the result you asked for, you can use a JOIN (I prefer INNER JOIN), such as:
SELECT TOP 2 * FROM table_name1 INNER JOIN table_name2 ON
table_name1.column_name = table_name2.column_name2
This joins the two tables on the rows where the value in one table's column matches the value in the other table's column.
You can then read the values in a for loop, or run this query inside a foreach loop to pull out the values.
My suggestion
My preferred method would be to first select just the data ordered by date.
Then, inside the foreach() loop where you write out that data, select the remaining data for that date and write it inside that loop.
Code (column_name) won't bother you
When you run the query with ORDER BY Time DESC, you no longer filter on CODE (for example, WHERE Code = value), and you still get the codes of the most recent rows. If you really need to filter on the code column, you can do so with an if/else block.
Reference:
http://technet.microsoft.com/en-us/library/ms190014(v=sql.105).aspx (Inner join)
http://www.w3schools.com/sql/sql_func_first.asp (top; check the Sql Server query)

Having problems fully understanding GROUP BY

I'm going over some practise questions for an exam that I have coming up and I'm having a problem fully understanding group by. I see GROUP BY as the following: group the result set by one or more columns.
I have the following database schema
My query
SELECT orders.customer_numb, sum(order_lines.cost_line), customers.customer_first_name, customers.customer_last_name
FROM orders
INNER JOIN customers ON customers.customer_numb = orders.customer_numb
INNER JOIN order_lines ON order_lines.order_numb = orders.order_numb
GROUP BY orders.customer_numb, order_lines.cost_line, customers.customer_first_name, customers.customer_last_name
ORDER BY order_lines.cost_line DESC
What I'm struggling to understand
Why can't I simply use just GROUP BY orders.cost_line and group the data by cost_line?
What I'm trying to achieve
I'd like to achieve the name of the customer who has spent the most money. I just don't fully understand how to achieve this. I understand how joins work, I just can't seem to get my head around why I can't simply GROUP BY customer_numb and cost_line (with sum() used to calculate the amount spent). I seem to always get "not a GROUP BY expression", if someone could explain what I'm doing wrong (not just give me the answer), that would be great - I'd really appreciate that, and of course any resources that you have for using GROUP by properly.
Sorry for the long essay and If I've missed anything I apologise. Any help would be greatly appreciated.
I just can't seem to get my head around why I can't simply GROUP BY
customer_numb and cost_line (with sum() used to calculate the amount
spent).
When you say group by customer_numb you know that customer_numb uniquely identifies a row in the customer table (assuming customer_numb is either a primary or alternate key), so that any given customers.customer_numb will have one and only one value for customers.customer_first_name and customers.customer_last_name. But at parse time Oracle does not know, or at least acts like it does not know that. And it says, in a bit of panic, "What do I do if a single customer_numb has more than one value for customer_first_name?"
Roughly the rule is, expressions in the select clause can use expressions in the group by clause and/or use aggregate functions. (As well as constants and system variables that don't depend on the base tables, etc.) And by "use" I mean be the expression or part of the expression. So once you group on first name and last name, customer_first_name || customer_last_name would be a valid expression also.
When you have a table like customers and are grouping by its primary key, or by a column with a unique key and a not null constraint, you can safely include the other columns of that table in the group by clause. In this particular instance: group by customers.customer_numb, customers.customer_first_name, customers.customer_last_name.
Also note that the order by in the first query will fail, since order_lines.cost_line doesn't have a single value for the group. You can order on sum(order_lines.cost_line), or use a column alias in the select clause and order on that alias:
SELECT orders.customer_numb,
sum(order_lines.cost_line),
customers.customer_first_name,
customers.customer_last_name
FROM orders
INNER JOIN customers ON customers.customer_numb = orders.customer_numb
INNER JOIN order_lines ON order_lines.order_numb = orders.order_numb
GROUP BY orders.customer_numb,
customers.customer_first_name,
customers.customer_last_name
ORDER BY sum(order_lines.cost_line)
or
SELECT orders.customer_numb,
sum(order_lines.cost_line) as sum_cost_line,
. . .
ORDER BY sum_cost_line
Note: I've heard that some RDBMSes will imply additional expressions for the grouping without them being explicitly stated. Oracle is not one of those RDBMSes.
As for grouping by both customer_numb and cost_line, consider a DB with two customers, 1 and 2, each with two orders of one line each:
Customer Number | Cost Line
1 | 20.00
1 | 20.00
2 | 35.00
2 | 30.00
select customer_number, cost_line, sum(cost_line)
FROM ...
group by customer_number, cost_line
order by sum(cost_line) desc
Customer Number | Cost Line | sum(cost_line)
1 | 20.00 | 40.00
2 | 35.00 | 35.00
2 | 30.00 | 30.00
The first row with highest sum(cost_line) is not the customer who spent the most.
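The two-customer example above can be reproduced with a small sketch. It runs against an in-memory SQLite database from Python rather than Oracle, and the table name order_costs is made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_costs (customer_number INTEGER, cost_line REAL)")
conn.executemany(
    "INSERT INTO order_costs VALUES (?, ?)",
    [(1, 20.0), (1, 20.0), (2, 35.0), (2, 30.0)],
)

# Grouping by both columns: one group per (customer, cost) pair.
by_customer_and_cost = conn.execute("""
    SELECT customer_number, cost_line, SUM(cost_line)
    FROM order_costs
    GROUP BY customer_number, cost_line
    ORDER BY SUM(cost_line) DESC
""").fetchall()

# Grouping by customer only: one group per customer, so SUM is the total spent.
by_customer = conn.execute("""
    SELECT customer_number, SUM(cost_line)
    FROM order_costs
    GROUP BY customer_number
    ORDER BY SUM(cost_line) DESC
""").fetchall()

print(by_customer_and_cost[0])  # customer 1 comes first, with a sum of 40
print(by_customer[0])           # customer 2 comes first, with a total of 65
```

Only the second query puts the customer who actually spent the most in the first row.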
I understand how joins work, I just can't seem to get my head around
why I can't simply GROUP BY customer_numb and cost_line (with sum()
used to calculate the amount spent).
This should give you the sum for every customer.
SELECT orders.customer_numb, sum(order_lines.cost_line)
FROM orders
INNER JOIN order_lines ON order_lines.order_numb = orders.order_numb
GROUP BY orders.customer_numb
Note that every column in the SELECT clause that's not an argument to an aggregate function is also a column in the GROUP BY clause.
Now you can join that with other tables to get more detail. Here's one way using a common table expression. (There are other ways to express what you want.)
with customer_sums as (
-- We give the columns useful aliases here.
SELECT orders.customer_numb as customer_numb,
sum(order_lines.cost_line) as total_orders
FROM orders
INNER JOIN order_lines ON order_lines.order_numb = orders.order_numb
GROUP BY orders.customer_numb
)
select c.customer_numb, c.customer_first_name, c.customer_last_name, cs.total_orders
from customers c
inner join customer_sums cs
on cs.customer_numb = c.customer_numb
order by cs.total_orders desc
Why can't I simply use just GROUP BY orders.cost_line and group the
data by cost_line?
Applying GROUP BY to order_lines.cost_line will give you one row for each distinct value in order_lines.cost_line. (The column orders.cost_line doesn't exist.) Here's what that data might look like.
OL.ORDER_NUMB OL.COST_LINE O.CUSTOMER_NUMB C.CUSTOMER_FIRST_NAME C.CUSTOMER_LAST_NAME
--
1 1.45 2014 Julio Savell
1 2.33 2014 Julio Savell
1 1.45 2014 Julio Savell
2 1.45 2014 Julio Savell
2 1.45 2014 Julio Savell
3 13.00 2014 Julio Savell
You can group by order_lines.cost_line, but it won't give you any useful information. This query
select order_lines.cost_line, orders.customer_numb
from order_lines
inner join orders on orders.order_numb = order_lines.order_numb
group by order_lines.cost_line, orders.customer_numb;
should return something like this.
OL.COST_LINE O.CUSTOMER_NUMB
--
1.45 2014
2.33 2014
13.00 2014
Not terribly useful.
If you're interested in the sum of the order line items, you need to decide what column or columns to group (summarize) by. If you group (summarize) by order number, you'll get three rows. If you group (summarize) by customer number, you'll get one row.
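The row counts mentioned above can be checked with a short sketch that loads the six sample order lines into an in-memory SQLite database from Python. The table layouts are reduced to the columns needed here, which is an assumption about the schema in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_numb INTEGER PRIMARY KEY, customer_numb INTEGER);
    CREATE TABLE order_lines (order_numb INTEGER, cost_line REAL);
    INSERT INTO orders VALUES (1, 2014), (2, 2014), (3, 2014);
    INSERT INTO order_lines VALUES
        (1, 1.45), (1, 2.33), (1, 1.45), (2, 1.45), (2, 1.45), (3, 13.00);
""")

# Summarize by order number: one row per order.
per_order = conn.execute("""
    SELECT ol.order_numb, SUM(ol.cost_line)
    FROM order_lines ol
    GROUP BY ol.order_numb
    ORDER BY ol.order_numb
""").fetchall()

# Summarize by customer number: one row per customer.
per_customer = conn.execute("""
    SELECT o.customer_numb, SUM(ol.cost_line)
    FROM orders o
    INNER JOIN order_lines ol ON ol.order_numb = o.order_numb
    GROUP BY o.customer_numb
""").fetchall()

print(len(per_order), "rows when grouped by order number")
print(len(per_customer), "row when grouped by customer number")
```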

How to group by a column

Hi, I know how to use the GROUP BY clause in SQL. I am not sure how to explain this, so I'll draw some charts. Here is my original data:
Name Location
----------------------
user1 1
user1 9
user1 3
user2 1
user2 10
user3 97
Here is the output I need
Name      Location
----------------------
user1     1
          9
          3
user2     1
          10
user3     97
Is this even possible?
The normal method for this is to handle it in the presentation layer, not the database layer.
Reasons:
The Name field is a property of that data row
If you leave the Name out, how do you know what Location goes with which name?
You are implicitly relying on the order of the data, which in SQL is a very bad practice (since there is no inherent ordering to the returned data)
Any solution will need to involve a cursor or a loop, which is not what SQL is optimized for - it likes working in SETS not on individual rows
Hope this helps
SELECT A.FINAL_NAME, A.LOCATION
FROM (SELECT DISTINCT DECODE((LAG(YT.NAME, 1) OVER(ORDER BY YT.NAME)),
YT.NAME,
NULL,
YT.NAME) AS FINAL_NAME,
YT.NAME,
YT.LOCATION
FROM YOUR_TABLE_7 YT) A
As Jirka correctly pointed out, the outer SELECT, the DISTINCT, and the raw Name column were all unnecessary. My mistake was that when I used DISTINCT, I got the result sorted like this:
1         1
2 user2   1
3 user3   97
4 user1   1
5         3
6         9
7         10
I wanted to avoid output like this, so I added the raw Name column and the outer SELECT.
However, removing the DISTINCT solves the problem, so this much is enough:
SELECT DECODE((LAG(YT.NAME, 1) OVER(ORDER BY YT.NAME)),
YT.NAME,
NULL,
YT.NAME) AS FINAL_NAME,
YT.LOCATION
FROM YOUR_TABLE_7 YT
Thanks Jirka
If you're using straight SQL*Plus to make your report (don't laugh, you can do some pretty cool stuff with it), you can do this with the BREAK command:
SQL> break on name
SQL> WITH q AS (
SELECT 'user1' NAME, 1 LOCATION FROM dual
UNION ALL
SELECT 'user1', 9 FROM dual
UNION ALL
SELECT 'user1', 3 FROM dual
UNION ALL
SELECT 'user2', 1 FROM dual
UNION ALL
SELECT 'user2', 10 FROM dual
UNION ALL
SELECT 'user3', 97 FROM dual
)
SELECT NAME,LOCATION
FROM q
ORDER BY name;
NAME    LOCATION
----- ----------
user1          1
               9
               3
user2          1
              10
user3         97
6 rows selected.
SQL>
I cannot but agree with the other commenters that this kind of problem does not look like it should ever be solved using SQL, but let us face it anyway.
SELECT
CASE WHEN preceding.name IS NULL THEN main.name ELSE NULL END AS name,
main.location
FROM mytable main LEFT JOIN mytable preceding
ON main.name = preceding.name AND preceding.id < main.id
GROUP BY main.id, main.name, main.location, preceding.name
ORDER BY main.id
The GROUP BY clause is not responsible for the grouping job, at least not directly. As a first approximation, an outer join to the same table (the LEFT JOIN above) can be used to determine on which row a particular value occurs for the first time. This is what we are after. This assumes that there are some unique id values that make it possible to arbitrarily order all the records. (The ORDER BY clause does NOT do this; it orders the output, not the input of the whole computation, but it is still necessary to make sure that the output is presented correctly, because the remaining SQL does not imply any particular order of processing.)
As you can see, there is still a GROUP BY clause in the SQL, but with a perhaps unexpected purpose. Its job is to "undo" a side effect of the LEFT JOIN, which is duplication of all main records that have many "preceding" ( = successfully joined) records.
This is quite normal with GROUP BY. The typical effect of a GROUP BY clause is a reduction in the number of records, together with the inability to query or test columns NOT listed in the GROUP BY clause, except through aggregate functions like COUNT, MIN, MAX, or SUM. This is because these columns really represent "groups of values" due to the GROUP BY, not just specific values.
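The self-join idea described above can be tried out with a small sketch against an in-memory SQLite database from Python. The id values are an assumption (the original data has no id column), and the join alias is named prev here purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, location INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "user1", 1),
        (2, "user1", 9),
        (3, "user1", 3),
        (4, "user2", 1),
        (5, "user2", 10),
        (6, "user3", 97),
    ],
)

# A row with no earlier row of the same name is the first occurrence,
# so only that row keeps its name. GROUP BY collapses the duplicates that
# the LEFT JOIN creates when several earlier rows match.
rows = conn.execute("""
    SELECT CASE WHEN prev.name IS NULL THEN main.name END AS name,
           main.location
    FROM mytable AS main
    LEFT JOIN mytable AS prev
      ON main.name = prev.name AND prev.id < main.id
    GROUP BY main.id, main.name, main.location, prev.name
    ORDER BY main.id
""").fetchall()

for name, location in rows:
    print(name or "", location)
```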
If you are using SQL*Plus, use the BREAK command. In this case, break on NAME.
If you are using another reporting tool, you may be able to compare the "name" field to the previous record and suppress printing when they are equal.
If you use GROUP BY, output rows are sorted according to the GROUP BY columns as if you had an ORDER BY for the same columns. To avoid the overhead of sorting that GROUP BY produces, add ORDER BY NULL:
SELECT a, COUNT(b) FROM test_table GROUP BY a ORDER BY NULL;
Relying on implicit GROUP BY sorting in MySQL 5.6 is deprecated. To achieve a specific sort order of grouped results, it is preferable to use an explicit ORDER BY clause. GROUP BY sorting is a MySQL extension that may change in a future release; for example, to make it possible for the optimizer to order groupings in whatever manner it deems most efficient and to avoid the sorting overhead.
For full information - http://academy.comingweek.com/sql-groupby-clause/
SQL GROUP BY STATEMENT
The SQL GROUP BY clause is used together with the SELECT statement to arrange identical data into groups.
Syntax:
SELECT column_nm, aggregate_function(column_nm) FROM table_nm WHERE column_nm operator value GROUP BY column_nm;
Example:
To understand the GROUP BY clause, refer to the sample database. The table below shows the fields of the “orders” table:
|EMPORD_ID|employee1ID|customerID|shippers_ID|
The table below shows the fields of the “shipper” table:
| shippers_ID| shippers_Name |
The table below shows the fields of the “table_emp1” table:
| employee1ID| first1_nm | last1_nm |
Example:
To find the number of orders sent by each shipper:
SELECT shipper.shippers_Name, COUNT(orders.EMPORD_ID) AS No_of_orders FROM orders LEFT JOIN shipper ON orders.shippers_ID = shipper.shippers_ID GROUP BY shippers_Name;
| shippers_Name | No_of_orders |
Example:
To use the GROUP BY statement on more than one column:
SELECT shipper.shippers_Name, table_emp1.last1_nm, COUNT(orders.EMPORD_ID) AS No_of_orders FROM ((orders INNER JOIN shipper ON orders.shippers_ID=shipper.shippers_ID) INNER JOIN table_emp1 ON orders.employee1ID = table_emp1.employee1ID)
GROUP BY shippers_Name,last1_nm;
| shippers_Name | last1_nm |No_of_orders |
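As a rough illustration of the orders-per-shipper query, here is a sketch run against an in-memory SQLite database from Python. The table and column names follow the field lists above, but every row of data is made up for this example, since the original post shows no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipper (shippers_ID INTEGER PRIMARY KEY, shippers_Name TEXT);
    CREATE TABLE orders (EMPORD_ID INTEGER PRIMARY KEY, employee1ID INTEGER,
                         customerID INTEGER, shippers_ID INTEGER);
    -- Hypothetical sample rows, invented for this sketch.
    INSERT INTO shipper VALUES (1, 'Shipper A'), (2, 'Shipper B');
    INSERT INTO orders VALUES (1, 10, 100, 1), (2, 11, 101, 1), (3, 10, 102, 2);
""")

# One output row per shipper name, counting the orders in each group.
rows = conn.execute("""
    SELECT shipper.shippers_Name, COUNT(orders.EMPORD_ID) AS No_of_orders
    FROM orders
    LEFT JOIN shipper ON orders.shippers_ID = shipper.shippers_ID
    GROUP BY shipper.shippers_Name
    ORDER BY shipper.shippers_Name
""").fetchall()

for name, count in rows:
    print(name, count)
```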

SQL query for child table summary and generalization

I have 4 tables, shown in the diagram below.
I want a summary query for the Institution table that returns only these columns:
InstitutionType ProductName Quantity
For example, here is sample data from the Institution table:
Id Name Address InstitutionTypeId
1 aaa ny132 1001
2 bbb dx23 1001
3 ccc bn33 1002
And the InstitutionProduct table looks like this:
Id ProductId Quantity InstitutionId
1 1000 120 1
2 1000 100 2
3 1000 50 3
I want the query to output the total quantity of a given product per institution type. The sample output would look like this:
InstitutionTypeId productId quantity
1001 1000 220
1002 1000 50
So I want to group the institutions by type and aggregate the product quantity across each institution type group.
I tried to use the GROUP BY clause, but because the product quantity is not a grouping element, it results in an error.
SELECT
Institution.InstitutionTypeID,
InstitutionProduct.ProductID,
SUM(InstitutionProduct.Quantity)
FROM
Institution
LEFT JOIN
InstitutionProduct
ON InstitutionProduct.InstitutionID = Institution.ID
GROUP BY
Institution.InstitutionTypeID,
InstitutionProduct.ProductID
If you are querying with group by, every selected field must either be wrapped in an aggregate function or be listed in the group by. The reason is that 'group by' returns exactly one row per 'group by' value, so an ungrouped field would be ambiguous if it had more than one value within a group. Even though this might not be the case for your dataset, the query engine cannot know this, and raises an error.
The solution is to apply an aggregate to every non-grouping field. Common aggregates include average (avg), sum (sum), minimum (min) and maximum (max). This would lead to something like
SELECT i.InstitutionTypeID, ip.ProductID, SUM(ip.Quantity) AS Quantity
FROM Institution i LEFT JOIN InstitutionProduct ip
ON ip.InstitutionID = i.ID
GROUP BY i.InstitutionTypeID, ip.ProductID
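To check the grouping by institution type and product against the sample rows from the question, here is a small sketch using an in-memory SQLite database from Python. The column spellings follow the corrected names used in the answers; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Institution (ID INTEGER PRIMARY KEY, Name TEXT,
                              Address TEXT, InstitutionTypeID INTEGER);
    CREATE TABLE InstitutionProduct (ID INTEGER PRIMARY KEY, ProductID INTEGER,
                                     Quantity INTEGER, InstitutionID INTEGER);
    INSERT INTO Institution VALUES
        (1, 'aaa', 'ny132', 1001), (2, 'bbb', 'dx23', 1001), (3, 'ccc', 'bn33', 1002);
    INSERT INTO InstitutionProduct VALUES
        (1, 1000, 120, 1), (2, 1000, 100, 2), (3, 1000, 50, 3);
""")

# Quantity is aggregated; the two remaining columns are the grouping columns.
rows = conn.execute("""
    SELECT i.InstitutionTypeID, ip.ProductID, SUM(ip.Quantity) AS Quantity
    FROM Institution i
    LEFT JOIN InstitutionProduct ip ON ip.InstitutionID = i.ID
    GROUP BY i.InstitutionTypeID, ip.ProductID
    ORDER BY i.InstitutionTypeID, ip.ProductID
""").fetchall()

for row in rows:
    print(row)
```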

Access query: find highest rank product in each category

I have a query that calculates product rank per Category:
qryEvalProd
category Product Rank
-------- ------- ----
Cat1 Prod6 1254
Cat1 Prod1 950
Cat1 Prod2 800
Cat2 Prod3 1500
Cat2 Prod5 950
I want to make a query on that, to return the best product for each category:
category Product
-------- -------
Cat1 Prod6
Cat2 Prod3
I know I could do that using a correlated subquery containing a GROUP BY and Max, but for performance reasons I am trying to do it in one shot using GROUP BY and First. However, I can't express the fact that I want the First product of each category when sorted by Rank DESC.
Is there a way to do that in one pass ?
This tweak lets you add the ORDER you wanted. It uses a subquery...but not a correlated subquery, so performance is not impaired:
SELECT Category, First(Product) AS BestProduct
FROM (
SELECT Category, Product, Rank
FROM qryEvalProd
ORDER BY Category, Rank DESC
) AS Ordered
GROUP BY Category;
Perhaps the use of the HAVING clause may help?
In fact this seems to work. I am just a bit annoyed because I can't force the order of the input dataset in my statement, but anyway, it works.
SELECT Category, First(Product) AS BestProduct
FROM qryEvalProd
GROUP BY Category;
I can add ORDER BY Category without any impact on the output. I would have liked to add ORDER BY Category, Rank to be sure, but that's not accepted.
If someone has a better suggestion you're still welcome.
You might start with something like this:
SELECT Category, Product
FROM qryEvalProd
WHERE Product =
(SELECT TOP 1 Self.Product
FROM qryEvalProd AS Self
WHERE Self.Category = qryEvalProd.Category
ORDER BY Self.Rank DESC)
If there is the possibility of ties (that is, two products in the same category with the same rank), this query will arbitrarily pick one of them.
If you want it to always pick the same one, you can probably make it do what you want by changing the ORDER BY clause. For example, if in the event of a tie you want it to pick the one with the first Product value in alphabetical order, you might change the ORDER BY clause to:
ORDER BY Self.Rank DESC, Self.Product
If you want it instead to list all the products with the highest rank, you might use a query like this:
SELECT Category, Product
FROM qryEvalProd
WHERE Rank =
(SELECT Max(Self.Rank)
FROM qryEvalProd AS Self
WHERE Self.Category = qryEvalProd.Category)
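The Max-based query above can be tried outside Access with a short sketch. It loads the sample rows into an in-memory SQLite database from Python, with qryEvalProd modeled as a plain table (an assumption, since in the question it is a saved query).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE qryEvalProd (Category TEXT, Product TEXT, Rank INTEGER)")
conn.executemany(
    "INSERT INTO qryEvalProd VALUES (?, ?, ?)",
    [
        ("Cat1", "Prod6", 1254),
        ("Cat1", "Prod1", 950),
        ("Cat1", "Prod2", 800),
        ("Cat2", "Prod3", 1500),
        ("Cat2", "Prod5", 950),
    ],
)

# Keep each row whose Rank equals the highest Rank within its own category.
# Ties would return every product that shares the top rank.
rows = conn.execute("""
    SELECT Category, Product
    FROM qryEvalProd
    WHERE Rank =
        (SELECT MAX(Self.Rank)
         FROM qryEvalProd AS Self
         WHERE Self.Category = qryEvalProd.Category)
    ORDER BY Category
""").fetchall()

for category, product in rows:
    print(category, product)
```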