Sorting with sql - sql

I have a question on how to sort data using sql. For that I made up a simple example to illustrate my problem.
if object_id('MyTable', 'U') is not null drop table dbo.MyTable;
create table MyTable
(
country varchar(10) not null
, town varchar( 10 ) not null
, amount int not null
)
insert into MyTable values
( 'US', 'NYC', 100 )
, ( 'US', 'BAL', 150 )
, ( 'US', 'WAS', 200 )
, ( 'CA', 'MON', 100 )
, ( 'CA', 'TOR', 150 )
, ( 'CA', 'VAN', 200 )
How can I sort the data so that the countries are sorted by amount in descending order and the towns are in alphabetical order within each country?
Thanks,
Christian

To sort in SQL, use Order By: http://msdn.microsoft.com/en-us/library/ms188385.aspx
So if you wanted the rows sorted by Country, then Amount, then Town, you'd add an ORDER BY clause after your WHERE clause, like:
ORDER BY Country, Amount DESC, Town

I think that this should do it:
SELECT
country,
SUM(amount) OVER(PARTITION BY country) as CountryTotal,
town,
amount
FROM MyTable
ORDER BY CountryTotal, country, town

I am not sure whether you want to order by the total amount per country or just by amount.
For the total amount per country, use the query below.
select
mt.country,
ctotal.countrytotal,
mt.town,
mt.amount
from
(
SELECT
country,
SUM(amount) as countrytotal
FROM MyTable
group BY country
) ctotal
inner join MyTable mt on ctotal.country = mt.country
order by ctotal.countrytotal,ctotal.country,mt.town
For just the amount, use the query below.
select * from MyTable mt
order by mt.amount,mt.country,mt.town
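For anyone who wants to try the "total per country" ordering without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. SQLite is an assumption here (the question targets SQL Server), and one amount is changed from the question's data so that the two country totals differ. It sorts countries by their total in descending order and towns alphabetically within each country.

```python
import sqlite3

# In-memory SQLite database; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable ("
    "country TEXT NOT NULL, town TEXT NOT NULL, amount INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("US", "NYC", 100),
        ("US", "BAL", 150),
        ("US", "WAS", 300),  # changed from 200 so the country totals differ
        ("CA", "MON", 100),
        ("CA", "TOR", 150),
        ("CA", "VAN", 200),
    ],
)

# Join each row to its country's total, then sort by that total (largest
# first), with country and town as tie-breakers.
query = """
SELECT mt.country, ct.countrytotal, mt.town, mt.amount
FROM MyTable AS mt
JOIN (
    SELECT country, SUM(amount) AS countrytotal
    FROM MyTable
    GROUP BY country
) AS ct ON ct.country = mt.country
ORDER BY ct.countrytotal DESC, mt.country, mt.town
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this modified data, the US rows (total 550) come before the CA rows (total 450), and the towns are alphabetical inside each country.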

Related

How to select columns that aren't part of an aggregate query using HAVING SUM() in the WHERE and selecting only certain rows on db2

Using AS400 db2 for this.
I have a table of orders. From that table I have to:
Get all orders from a specified list of order IDs and type
Group by the user_id on those orders
Check to make sure the total order amount on the group is greater than $100
Return all orders that matched the group but the results won't be grouped, which includes order_id which is not part of the group
I got a bit stuck because the AS400 did not like that I was asking to select a field that wasn't part of the group, which I need.
I came up with this query, but it's slow.
-- Create a common temp table we can use in both places
WITH wantedOrders AS (
SELECT order_id FROM orders
WHERE
-- Only orders from the web
order_type = 'web'
-- And only orders that we want to get at this time
AND order_id IN
(
50,
20,
30
)
)
-- Our main select that gets all order information, even the non-grouped stuff
SELECT
t1.order_id,
t1.user_id,
t1.amount,
t2.total_amount,
t2.count
FROM orders AS t1
-- Join in the group data where we can do our query
JOIN (
SELECT
user_id,
SUM(amount) as total_amount,
COUNT(*) AS count
FROM
orders
-- Re use the temp table to get the order numbers
WHERE order_id IN (SELECT order_id FROM wantedOrders)
GROUP BY
user_id
HAVING SUM(amount)>100
) AS t2 ON t2.user_id=t1.user_id
-- Make sure we only use the order numbers
WHERE order_id IN (SELECT order_id FROM wantedOrders)
ORDER BY t1.user_id ASC;
What's the better way to write this query?
Try this:
WITH
wantedOrders (order_id) AS
(
VALUES 1, 2
)
, orders (order_id, user_id, amount) AS
(
VALUES
(1, 1, 50)
, (2, 1, 50)
, (1, 2, 60)
, (2, 2, 60)
, (3, 3, 200)
, (4, 3, 200)
)
-- Our main select that gets all order information, even the non-grouped stuff
SELECT *
FROM
(
SELECT
order_id,
user_id,
amount,
SUM (amount) OVER (PARTITION BY user_id) AS total_amount,
COUNT (*) OVER (PARTITION BY user_id) AS count
FROM orders t
WHERE EXISTS
(
SELECT 1
FROM wantedOrders w
WHERE w.order_id = t.order_id
)
) A
WHERE total_amount > 100
ORDER BY user_id ASC
ORDER_ID | USER_ID | AMOUNT | TOTAL_AMOUNT | COUNT
       1 |       2 |     60 |          120 |     2
       2 |       2 |     60 |          120 |     2
If order_id is the PK of the table, just add the columns you need to the wantedOrders query and use it as your "base" (instead of using orders and refiltering it). You should end up joining wantedOrders with itself.
You can do:
select t.*
from orders t
join (
select user_id
from orders t
where order_id in (50, 20, 30)
group by user_id
having sum(amount) > 100
) s on s.user_id = t.user_id
The first table orders as t will produce the data you want. It will be filtered by the second "table expression" s that preselects the groups according to your logic.
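As a rough cross-check of the two approaches discussed in this thread (window aggregates versus a grouped subquery joined back to the table), the sketch below runs both against the sample rows from the "Try this" answer. It uses Python's sqlite3 module rather than DB2, which is an assumption made only so the example is runnable; window functions need SQLite 3.25 or later. The wanted order IDs (1 and 2) are hard-coded, and the column alias cnt is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, user_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 50), (2, 1, 50), (1, 2, 60), (2, 2, 60), (3, 3, 200), (4, 3, 200)],
)

# Approach 1: window aggregates over the filtered rows, then filter on the total.
windowed_sql = """
SELECT order_id, user_id, amount, total_amount, cnt
FROM (
    SELECT order_id, user_id, amount,
           SUM(amount) OVER (PARTITION BY user_id) AS total_amount,
           COUNT(*)    OVER (PARTITION BY user_id) AS cnt
    FROM orders
    WHERE order_id IN (1, 2)
) AS a
WHERE total_amount > 100
ORDER BY user_id, order_id
"""

# Approach 2: grouped subquery joined back to the detail rows. The order
# filter is applied on both sides so only the wanted orders are returned.
joined_sql = """
SELECT o.order_id, o.user_id, o.amount, g.total_amount, g.cnt
FROM orders AS o
JOIN (
    SELECT user_id, SUM(amount) AS total_amount, COUNT(*) AS cnt
    FROM orders
    WHERE order_id IN (1, 2)
    GROUP BY user_id
    HAVING SUM(amount) > 100
) AS g ON g.user_id = o.user_id
WHERE o.order_id IN (1, 2)
ORDER BY o.user_id, o.order_id
"""

windowed_rows = conn.execute(windowed_sql).fetchall()
joined_rows = conn.execute(joined_sql).fetchall()
print(windowed_rows)
print(joined_rows)
```

Both queries return the same two rows for user 2 here. Which one is faster on a real DB2 system depends on indexes and data volume, so it is worth comparing the actual access plans.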

How can I filter a table to retrieve only one occurrence of each record

I am trying to find only one occurrence for each customer.
However, in my database I have customers that have been added twice (following an ERP migration).
If a customer has two occurrences, I have to keep the one that has a 'C' in the "customer_id" column.
In this example, "Manu Johns" appears twice, so we must keep the row that has a 'C' in the customer_id column in the final table.
If a customer has only one occurrence, and it does not have a 'C' in the customer_id column, we have to add it as is to the final table.
In this example, "Mathieu Wainers" appears only once, so we keep that row as it is in the final table.
Which query would give me this result: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=9484f43c0a6c1ccdae7d659ca53e1eab
CREATE TABLE PersonsInitial (
tel int,
firstname varchar(255),
lastname varchar(255),
Customer_ID varchar(255)
);
insert into PersonsInitial(tel,firstname,lastname,Customer_ID) values
('01234','Manu','Johns','456'),
('01234','Manu','Johns','C456'),
('21234','Fernand','Wajk','C389'),
('13554','Mathieu','Wainers','4683');
select distinct tel, firstname, lastname, customer_id from PersonsInitial
--if there is a person with the same tel number chose the customer id with 'C'
--if I don't have the choice add the customer without C
CREATE TABLE PersonsFinal (
tel int,
firstname varchar(255),
lastname varchar(255),
Customer_ID varchar(255)
);
insert into PersonsFinal(tel,firstname,lastname,Customer_ID) values
('01234','Manu','Johns','C456'),
('21234','Fernand','Wajk','C389'),
('13554','Mathieu','Wainers','4683');
select distinct tel, firstname, lastname, customer_id from PersonsFinal
You can rank the rows first based on whether or not the customer ID starts with "C". That's what the CTE is for.
with cte as (select row_number() over (partition by tel, firstname, lastname order by case when left(customer_id, 1) = 'C' then 0 else 1 end) rn,
p.*
from PersonsInitial p)
select *
from cte
where rn = 1; -- selects only the rows with "C", or the rows for which there is no "C" counterpart
There are multiple solutions to this problem. You can, for example, use OUTER APPLY:
insert into PersonsFinal(tel,firstname,lastname,Customer_ID)
select distinct pi1.tel, pi1.firstname, pi1.lastname, coalesce(pi.Customer_ID, pi1.Customer_ID) Customer_Id
from PersonsInitial pi1
outer apply (select top(1) *
from PersonsInitial pi2
where pi1.tel = pi2.tel
and pi1.firstname = pi2.firstname
and pi1.lastname = pi2.lastname
and pi2.Customer_ID like 'C%') pi;
Another solution :
WITH CTE AS
(
SELECT tel,
firstname,
lastname,
Customer_ID,
ROW_NUMBER() OVER (PARTITION BY tel, firstname, lastname ORDER BY LEN(Customer_ID) DESC) AS RowNumber -- assumes the 'C'-prefixed ID is the longer of the duplicates
FROM PersonsInitial
)
SELECT tel,
firstname,
lastname,
Customer_ID
FROM CTE
WHERE RowNumber = 1
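The ranking idea can be tried locally with the sketch below, which loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server here (an assumption; window functions need SQLite 3.25 or later), tel is stored as text so the leading zero is kept, and SUBSTR replaces LEFT because SQLite has no LEFT function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PersonsInitial ("
    "tel TEXT, firstname TEXT, lastname TEXT, customer_id TEXT)"
)
conn.executemany(
    "INSERT INTO PersonsInitial VALUES (?, ?, ?, ?)",
    [
        ("01234", "Manu", "Johns", "456"),
        ("01234", "Manu", "Johns", "C456"),
        ("21234", "Fernand", "Wajk", "C389"),
        ("13554", "Mathieu", "Wainers", "4683"),
    ],
)

# Within each person (tel + name), rows whose ID starts with 'C' sort first,
# so rn = 1 is the 'C' row when one exists and the only row otherwise.
dedupe_sql = """
WITH ranked AS (
    SELECT tel, firstname, lastname, customer_id,
           ROW_NUMBER() OVER (
               PARTITION BY tel, firstname, lastname
               ORDER BY CASE WHEN SUBSTR(customer_id, 1, 1) = 'C' THEN 0 ELSE 1 END
           ) AS rn
    FROM PersonsInitial
)
SELECT tel, firstname, lastname, customer_id
FROM ranked
WHERE rn = 1
ORDER BY tel
"""
final_rows = conn.execute(dedupe_sql).fetchall()
for row in final_rows:
    print(row)
```

The key design choice is the partition: it has to be the columns that identify a person (here tel and the two name columns), not the customer ID itself, otherwise every row ends up alone in its own group.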

SQL Query to identify "Top Performers" [?]

I'm still learning Oracle SQL and would like your guidance.
Let's say we have a MONTHLY_SALES_TOTALS table that has 3 fields: name, region, amount. We need to determine the best salespeople per region. "Best" means that their amount is equal to the maximum for the region.
CREATE TABLE montly_sales_totals
(
name varchar(20),
amount numeric(9),
region varchar(30)
);
INSERT ALL
INTO montly_sales_totals (name, amount, region) VALUES ('Peter', 55555, 'east')
INTO montly_sales_totals (name, amount, region) VALUES ('Susan', 55555, 'east')
INTO montly_sales_totals (name, amount, region) VALUES ('Mark', 1000000, 'south')
INTO montly_sales_totals (name, amount, region) VALUES ('Glenn', 50000, 'east')
INTO montly_sales_totals (name, amount, region) VALUES ('Paul', 500000, 'south')
SELECT * from dual;
Possible solution:
SELECT m1.name, m1.region, m1.amount
FROM montly_sales_totals m1
JOIN
(SELECT MAX(amount) max_amount, region FROM montly_sales_totals GROUP BY region) m2
ON (m1.region = m2.region)
WHERE m1.amount = m2.max_amount
ORDER by 2,1;
SQL Fiddle: http://sqlfiddle.com/#!4/6a2d8/6
Now my questions:
How efficient is such query?
How can/should it be simplified and/or improved?
I could not use TOP since the number of "max" rows varies by region. Is there another built-in feature I could have used instead?
I would use RANK():
SELECT *
FROM (
SELECT name, amount, region,
RANK() OVER (PARTITION BY region ORDER BY amount DESC) rnk
FROM montly_sales_totals
) t
WHERE t.rnk = 1
Here's a modified version of the SQL Fiddle
There are a number of ways one can go about this. Here's another (note that Oracle does not accept AS before a table alias):
select S.region, S.name, V.regionmax
from montly_sales_totals S
inner join
(
select region, max(amount) as regionmax
from montly_sales_totals group by region
) V
on S.region = V.region and S.amount = V.regionmax
As to efficiency, the main factor is the use of the proper index(es). Inline views can perform very well.
I like CTE syntax, but using that website the time taken is the same 2ms, so I can't beat yours :)
with Maximums as (
SELECT region,
MAX(amount) max_amount
FROM montly_sales_totals GROUP BY region
)
SELECT m1.name, m1.region, m1.amount
FROM montly_sales_totals m1, Maximums
WHERE (m1.amount = Maximums.max_amount)
and (m1.region = Maximums.region)
ORDER by 2,1;
You can also do this with the ROW_NUMBER() analytic function:
select *
from (
select m1.*,
row_number() over (partition by m1.region order by m1.amount desc, m1.name desc) max_sal
from montly_sales_totals m1
)
where max_sal = 1;
Note that, unlike the queries above, this one returns exactly one row per region even when two salespeople have the same amount; the tie is broken by name.
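To see how RANK() and ROW_NUMBER() differ when amounts tie, here is a small sketch that runs both against the question's sample data. It uses Python's sqlite3 module instead of Oracle (an assumption made so the example is self-contained; SQLite 3.25 or later is needed for window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE montly_sales_totals (name TEXT, amount INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO montly_sales_totals VALUES (?, ?, ?)",
    [
        ("Peter", 55555, "east"),
        ("Susan", 55555, "east"),
        ("Mark", 1000000, "south"),
        ("Glenn", 50000, "east"),
        ("Paul", 500000, "south"),
    ],
)

# RANK() gives tied rows the same rank, so every top performer is returned.
rank_sql = """
SELECT name FROM (
    SELECT name,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
    FROM montly_sales_totals
) AS t
WHERE rnk = 1
ORDER BY name
"""

# ROW_NUMBER() numbers rows uniquely, so exactly one row per region survives;
# the extra "name DESC" sort key decides which tied row wins.
row_number_sql = """
SELECT name FROM (
    SELECT name,
           ROW_NUMBER() OVER (
               PARTITION BY region ORDER BY amount DESC, name DESC
           ) AS rn
    FROM montly_sales_totals
) AS t
WHERE rn = 1
ORDER BY name
"""

rank_names = [r[0] for r in conn.execute(rank_sql)]
row_number_names = [r[0] for r in conn.execute(row_number_sql)]
print(rank_names)
print(row_number_names)
```

Since the question defines "best" as everyone whose amount equals the regional maximum, RANK() (or DENSE_RANK()) matches the requirement, while ROW_NUMBER() drops Peter from the east region.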

SQL Group BY SUM one column and select of first row of grouped items

I have a part table with 5 fields. I want to sum the QTY per MFGPN while showing the first returned row for the other 3 fields (Manufacturer, DateCode, Description). I initially thought of using the MIN function as follows, but that doesn't really help me because the data is not of an int data type. How would I go about doing this? Right now I'm stuck at the following query:
SELECT SUM([QTY]) AS QTY
,[MFGPN]
,MIN([MANUFACTURER]) AS MANUFACTURER
,MIN([DATECODE]) AS DateCode
,MIN([DESCRIPTION]) AS DESCRIPTION
INTO part
GROUP BY MFGPN, MANUFACTURER, DATECODE, description
ORDER BY mfgpn ASC
Would CROSS APPLY work for you?
SELECT
SUM(a.[QTY]) AS QTY
,a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
FROM part a
CROSS APPLY (SELECT TOP 1 * FROM part b WHERE a.[MFGPN] = b.[MFGPN]) c
GROUP BY
a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
Tested with the following:
-- Table variables are declared with @; the # prefix is for temp tables
DECLARE @T1 AS TABLE (
[QTY] int
,[MFGPN] NVARCHAR(50)
,[MANUFACTURER] NVARCHAR(50)
,[DATECODE] DATE
,[DESCRIPTION] NVARCHAR(50));
INSERT @T1 VALUES
(2, 'MFGPN-1', 'MANUFACTURER-A', '20120101', 'A-1'),
(4, 'MFGPN-1', 'MANUFACTURER-B', '20120102', 'B-1'),
(3, 'MFGPN-1', 'MANUFACTURER-C', '20120103', 'C-1'),
(1, 'MFGPN-2', 'MANUFACTURER-A', '20120101', 'A-2'),
(5, 'MFGPN-2', 'MANUFACTURER-B', '20120101', 'B-2');
SELECT
SUM(a.[QTY]) AS QTY
,a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
FROM @T1 a
CROSS APPLY (SELECT TOP 1 * FROM @T1 b WHERE a.[MFGPN] = b.[MFGPN]) c
GROUP BY
a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
Produces
QTY MFGPN MANUFACTURER DATECODE DESCRIPTION
9 MFGPN-1 MANUFACTURER-A 2012-01-01 A-1
6 MFGPN-2 MANUFACTURER-A 2012-01-01 A-2
This can be easily managed with a windowed SUM():
WITH summed_and_ranked AS (
SELECT
MFGPN,
MANUFACTURER,
DATECODE,
DESCRIPTION,
QTY = SUM(QTY) OVER (PARTITION BY MFGPN),
RNK = ROW_NUMBER() OVER (
PARTITION BY MFGPN
ORDER BY DATECODE -- or which column should define the order?
)
FROM atable
)
SELECT
MFGPN,
MANUFACTURER,
DATECODE,
DESCRIPTION,
QTY
INTO parts
FROM summed_and_ranked
WHERE RNK = 1
;
For every row, the total group quantity and the ranking within the group is calculated. When actually getting rows for inserting into the new table (the main SELECT), only rows with RNK values of 1 are pulled. Thus you get a result set containing group totals as well as details of certain rows.
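The windowed SUM() plus ROW_NUMBER() idea described above can be checked with the sketch below, which reuses the test rows from the CROSS APPLY answer. It runs on SQLite via Python's sqlite3 module rather than SQL Server (an assumption; SQLite 3.25 or later is needed), stores the date codes as ISO text, and adds MANUFACTURER as a second sort key (also an assumption) so that the "first" row is deterministic when two rows share a date code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part ("
    "qty INTEGER, mfgpn TEXT, manufacturer TEXT, datecode TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?, ?, ?)",
    [
        (2, "MFGPN-1", "MANUFACTURER-A", "2012-01-01", "A-1"),
        (4, "MFGPN-1", "MANUFACTURER-B", "2012-01-02", "B-1"),
        (3, "MFGPN-1", "MANUFACTURER-C", "2012-01-03", "C-1"),
        (1, "MFGPN-2", "MANUFACTURER-A", "2012-01-01", "A-2"),
        (5, "MFGPN-2", "MANUFACTURER-B", "2012-01-01", "B-2"),
    ],
)

# Every row gets its group's total and its position within the group;
# keeping rn = 1 leaves one detail row per MFGPN with the group total attached.
summed_sql = """
WITH summed_and_ranked AS (
    SELECT mfgpn, manufacturer, datecode, description,
           SUM(qty) OVER (PARTITION BY mfgpn) AS total_qty,
           ROW_NUMBER() OVER (
               PARTITION BY mfgpn ORDER BY datecode, manufacturer
           ) AS rn
    FROM part
)
SELECT mfgpn, manufacturer, datecode, description, total_qty
FROM summed_and_ranked
WHERE rn = 1
ORDER BY mfgpn
"""
part_rows = conn.execute(summed_sql).fetchall()
for row in part_rows:
    print(row)
```

Unlike TOP 1 without an ORDER BY, this makes the choice of "first row" explicit, which matters if the result has to be reproducible.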

PostgreSQL SELECT the last order per customer per date range

In PostgreSQL:
I have a Table that has 3 columns:
CustomerNum, OrderNum, OrderDate.
There may (or may not) be many orders for each customer per date range. What I need is the last OrderNum for each customer that lies in the supplied date range.
What I have been doing is getting a ResultSet of the customers and querying each one separately, but this is taking too much time.
Is there any way of using a sub-select to select out the customers, then get the last OrderNum for each Customer?
On postgres you can also use the non-standard DISTINCT ON clause:
SELECT DISTINCT ON (CustomerNum) CustomerNum, OrderNum, OrderDate
FROM Orders
WHERE OrderDate BETWEEN 'yesterday' AND 'today'
ORDER BY CustomerNum, OrderDate DESC;
See http://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT
select customernum, max(ordernum)
from table
where orderdate between '...' and '...'
group by customernum
that's all.
SELECT t1.CustomerNum, t1.OrderNum As LastOrderNum, t1.OrderDate As LastOrderDate
FROM table1 As t1
WHERE t1.OrderDate = (SELECT MAX(t2.OrderDate)
FROM table1 t2
WHERE t1.CustomerNum = t2.CustomerNum
AND t2.OrderDate BETWEEN date1 AND date2)
AND t1.OrderDate BETWEEN date1 AND date2
Not sure about your Customer table's structure or relationships, but this should work:
SELECT Customer.Num, (
SELECT OrderNum FROM Orders WHERE CustomerNum = Customer.Num AND OrderDate BETWEEN :start AND :end ORDER BY OrderNum DESC LIMIT 1
) AS LastOrderNum
FROM Customer
If by last order number you mean the largest order number then you can just use your select as the predicate for customer num, group the results and select the maximum:
SELECT CustomerNum, MAX(OrderNum) AS LastOrderNum
FROM Orders
WHERE
CustomerNum IN (SELECT CustomerNum FROM ...)
AND
OrderDate BETWEEN :first_date AND :last_date
GROUP BY CustomerNum
If the last order number isn't necessarily the largest order number, then you'll need to find the largest order date for each customer and join it back to the orders to find the corresponding number(s):
SELECT O.CustomerNum, O.OrderNum AS LastOrderNum
FROM
(SELECT CustomerNum, MAX(OrderDate) AS OrderDate
FROM Orders
WHERE
OrderDate BETWEEN :first_date AND :last_date
AND
CustomerNum IN (SELECT CustomerNum FROM ...)
GROUP BY CustomerNum
) AS CustLatest
INNER JOIN
Orders AS O USING (CustomerNum, OrderDate);
-- generate some data
DROP TABLE tmp.orders;
CREATE TABLE tmp.orders
( id INTEGER NOT NULL
, odate DATE NOT NULL
, payload VARCHAR
)
;
ALTER TABLE tmp.orders ADD PRIMARY KEY (id,odate);
INSERT INTO tmp.orders(id,odate,payload) VALUES
(1, '2011-10-04' , 'one' )
, (1, '2011-10-24' , 'two' )
, (1, '2011-10-25' , 'three' )
, (1, '2011-10-26' , 'four' )
, (2, '2011-10-23' , 'five' )
, (2, '2011-10-24' , 'six' )
;
-- CTE to the rescue ...
WITH sel AS (
SELECT * FROM tmp.orders
WHERE odate BETWEEN '2011-10-23' AND '2011-10-24'
)
SELECT * FROM sel s0
WHERE NOT EXISTS (
SELECT * FROM sel sx
WHERE sx.id = s0.id
AND sx.odate > s0.odate
)
;
result:
DROP TABLE
CREATE TABLE
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "orders_pkey" for table "orders"
ALTER TABLE
INSERT 0 6
id | odate | payload
----+------------+---------
1 | 2011-10-24 | two
2 | 2011-10-24 | six
(2 rows)
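For comparison with the NOT EXISTS result above, the sketch below applies the "latest date per customer, joined back to the orders" approach to the same sample rows. It runs on SQLite through Python's sqlite3 module instead of PostgreSQL (an assumption, so DISTINCT ON is not available and dates are compared as ISO text), and the function name last_order_per_customer is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER NOT NULL, odate TEXT NOT NULL, payload TEXT, "
    "PRIMARY KEY (id, odate))"
)
conn.executemany(
    "INSERT INTO orders (id, odate, payload) VALUES (?, ?, ?)",
    [
        (1, "2011-10-04", "one"),
        (1, "2011-10-24", "two"),
        (1, "2011-10-25", "three"),
        (1, "2011-10-26", "four"),
        (2, "2011-10-23", "five"),
        (2, "2011-10-24", "six"),
    ],
)


def last_order_per_customer(connection, start, end):
    """Return each customer's latest order within [start, end] (inclusive)."""
    sql = """
    SELECT o.id, o.odate, o.payload
    FROM orders AS o
    JOIN (
        -- latest order date per customer inside the range
        SELECT id, MAX(odate) AS odate
        FROM orders
        WHERE odate BETWEEN ? AND ?
        GROUP BY id
    ) AS latest
      ON latest.id = o.id AND latest.odate = o.odate
    ORDER BY o.id
    """
    return connection.execute(sql, (start, end)).fetchall()


narrow = last_order_per_customer(conn, "2011-10-23", "2011-10-24")
whole_month = last_order_per_customer(conn, "2011-10-01", "2011-10-31")
print(narrow)
print(whole_month)
```

Because (id, odate) is the primary key in this sample, the join cannot return more than one row per customer; with several orders on the same date, a tie-breaker such as the order number would be needed.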