Compare Previous column data with the next column data - sql

I have a sales table with the following columns. I want to select the rows where the sale price increases compared with the row above, and skip the rows where the sale price decreases.
For example, in the following table I would like to get all rows except the row with SaleId = 4.
+--------+--------+-----------+
| SaleId | ItemId | SalePrice |
+--------+--------+-----------+
| 1 | 987 | 12 |
+--------+--------+-----------+
| 2 | 678 | 13 |
+--------+--------+-----------+
| 3 | 987 | 15 |
+--------+--------+-----------+
| 4 | 542 | 11 |
+--------+--------+-----------+
| 5 | 678 | 16 |
+--------+--------+-----------+
I have tried using an inner join, but it returns nothing.
Here is the query I wrote:
select s1.* from saletable s1
join saletable s2 on s1.saleid = s2.saleid
where s1.saleprice<s2.saleprice

Consider the following solution using running max
select t.*
from
(
select *, max(SalePrice) over (order by SaleId) runningMaxSalePrice
from testdata
) t
where t.SalePrice >= t.runningMaxSalePrice
This solution also handles more than one consecutive row with a decreasing SalePrice: every such row is skipped.

Use lag():
select st.*
from (select st.*, lag(saleprice) over (order by saleid ) as prev_saleprice
from saletable st
) st
where prev_saleprice is null or saleprice > prev_saleprice
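The running-max and lag() approaches agree on the question's data but can differ on other inputs. The sketch below is one way to check this, assuming Python's sqlite3 module with an SQLite build that supports window functions (3.25 or later) in place of the asker's database. The helper name increasing_rows and the reduced two-column saletable are made up for illustration.

```python
import sqlite3

# Keep rows whose price is the highest seen so far (running max).
RUNNING_MAX_SQL = """
SELECT SaleId
FROM (
    SELECT SaleId, SalePrice,
           MAX(SalePrice) OVER (ORDER BY SaleId) AS running_max
    FROM saletable
) AS t
WHERE SalePrice >= running_max
ORDER BY SaleId
"""

# Keep rows whose price is higher than the immediately preceding row (lag).
LAG_SQL = """
SELECT SaleId
FROM (
    SELECT SaleId, SalePrice,
           LAG(SalePrice) OVER (ORDER BY SaleId) AS prev_price
    FROM saletable
) AS t
WHERE prev_price IS NULL OR SalePrice > prev_price
ORDER BY SaleId
"""


def increasing_rows(prices):
    """Return (running_max_ids, lag_ids) for prices stored as SaleId 1, 2, 3, ..."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE saletable (SaleId INTEGER PRIMARY KEY, SalePrice INTEGER)")
    conn.executemany("INSERT INTO saletable VALUES (?, ?)", list(enumerate(prices, start=1)))
    running_max_ids = [row[0] for row in conn.execute(RUNNING_MAX_SQL)]
    lag_ids = [row[0] for row in conn.execute(LAG_SQL)]
    conn.close()
    return running_max_ids, lag_ids


if __name__ == "__main__":
    print(increasing_rows([12, 13, 15, 11, 16]))      # the question's prices
    print(increasing_rows([12, 13, 15, 11, 14, 16]))  # partial recovery after the drop
```

On the question's prices both queries keep SaleId 1, 2, 3 and 5. On the second list the running max keeps 1, 2, 3 and 6, while lag() also keeps row 5, because 14 is higher than the previous row's 11 even though it is below the earlier 15. Which behaviour is wanted depends on how "increasing" is meant.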


SQL how to calculate median not based on rows

I have a sample of cars in my table and I would like to calculate the median price for my sample with SQL. What is the best way to do it?
+-----+-------+----------+
| Car | Price | Quantity |
+-----+-------+----------+
| A | 100 | 2 |
| B | 150 | 4 |
| C | 200 | 8 |
+-----+-------+----------+
I know that I can use percentile_cont (or percentile_disc) if my table is like this:
+-----+-------+
| Car | Price |
+-----+-------+
| A | 100 |
| A | 100 |
| B | 150 |
| B | 150 |
| B | 150 |
| B | 150 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
+-----+-------+
But in the real world, my first table has about 100 million rows and the second table should have about 3 billiard rows (and moreover I don't know how to transform my first table into the second).
Here is a way to do this in SQL Server.
The first step calculates the positions of the lower and upper bounds of the median: with an odd number of elements the two bounds are the same, otherwise they are the (x/2)th and (x/2 + 1)th values.
Then I take the cumulative sum of the quantity and use it to pick the elements at the lower and upper bounds, as follows:
with median_dt
as (
select case when sum(quantity)%2=0 then
sum(quantity)/2
else
sum(quantity)/2 + 1
end as lower_limit
,case when sum(quantity)%2=0 then
(sum(quantity)/2) + 1
else
sum(quantity)/2 + 1
end as upper_limit
from t
)
,data
as (
select *,sum(quantity) over(order by price asc) as cum_sum
from t
)
,rnk_val
as(select *
from (
select price,row_number() over(order by d.cum_sum asc) as rnk
from data d
join median_dt b
on b.lower_limit<=d.cum_sum
)x
where x.rnk=1
union all
select *
from (
select price,row_number() over(order by d.cum_sum asc) as rnk
from data d
join median_dt b
on b.upper_limit<=d.cum_sum
)x
where x.rnk=1
)
select avg(price) as median
from rnk_val
+--------+
| median |
+--------+
| 200 |
+--------+
db fiddle link
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=c5cfa645a22aa9c135032eb28f1749f6
This looks right on a few results, but try it on a larger set to double-check.
First create a table which has the total for each car (or use a CTE or a subquery; your choice). I'm just creating a separate table here.
create table table2 as
(
select car,
quantity,
price,
price * quantity as total
from table1
)
Then run this query, which looks for the price group that falls in the middle.
select price
from (
select car, price,
sum(total) over (order by car) as rollsum,
sum(total) over () as total
from table2
)a
where rollsum >= total/2
Correctly returns a value of $200.
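A median over (Price, Quantity) pairs can be found without expanding the rows: order by Price, accumulate Quantity, and read off the prices at the two middle positions. The sketch below is one possible version of that idea, run through Python's sqlite3 module (assumed to have window-function support) rather than SQL Server. The table name cars and the function weighted_median are hypothetical.

```python
import sqlite3

MEDIAN_SQL = """
WITH running AS (
    SELECT Price,
           SUM(Quantity) OVER (ORDER BY Price) AS cum_qty,   -- units at or below this price
           SUM(Quantity) OVER ()               AS total_qty  -- units overall
    FROM cars
)
SELECT (
    -- lower middle position: (n + 1) / 2 with integer division
    (SELECT MIN(Price) FROM running WHERE cum_qty >= (total_qty + 1) / 2)
    -- upper middle position: n / 2 + 1 (the same position when n is odd)
  + (SELECT MIN(Price) FROM running WHERE cum_qty >= total_qty / 2 + 1)
) / 2.0 AS median
"""


def weighted_median(rows):
    """rows: iterable of (car, price, quantity); returns the median price as a float."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cars (Car TEXT, Price INTEGER, Quantity INTEGER)")
    conn.executemany("INSERT INTO cars VALUES (?, ?, ?)", list(rows))
    (median,) = conn.execute(MEDIAN_SQL).fetchone()
    conn.close()
    return median


if __name__ == "__main__":
    print(weighted_median([("A", 100, 2), ("B", 150, 4), ("C", 200, 8)]))
```

For the question's table this gives 200.0, matching the expanded 14-row list. Note that the weights here are the quantities alone, so a price counts once per unit sold.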

sql query to find unique records

I am new to SQL and need your help with the following. I have tried using GROUP BY and COUNT, but the unique group I get back still contains all the rows that are duplicated.
Below is my source data.
CDR_ID,TelephoneNo,Call_ID,call_Duration,Call_Plan
543,xxx-23,12,12,500
543,xxx-23,12,12,501
543,xxx-23,12,12,510
643,xxx-33,11,17,700
343,xxx-33,11,17,700
766,xxx-74,32,1,300
766,xxx-74,32,1,300
877,xxx-32,12,2,300
877,xxx-32,12,2,300
877,xxx-32,12,2,301
Please note: the source contains each distinct record many times, so when I count, the unique set does not come out with count = 1.
For example, the source data below has 60 records for each combination:
877,xxx-32,12,2,300 -- 60 records
877,xxx-32,12,2,301 -- 60 records
I am trying to get only the unique records, but the duplicated records are also being returned.
Below are the rows that should come up in the unique group. There can be multiple Call_Plans for the same combination of CDR_ID,TelephoneNo,Call_ID,call_Duration; I want to read only the records for which there is exactly one call plan for each unique combination of CDR_ID,TelephoneNo,Call_ID,call_Duration.
CDR_ID,TelephoneNo,Call_ID,call_Duration,Call_Plan
643,xxx-33,11,17,700
343,xxx-33,11,17,700
766,xxx-74,32,1,300
Please advise on this.
Thanks and Regards
To do more complex groupings you could also use a Common Table Expression/Derived Table along with windowed functions:
declare @t table(CDR_ID int,TelephoneNo nvarchar(20),Call_ID int,call_Duration int,Call_Plan int);
insert into @t values (543,'xxx-23',12,12,500),(543,'xxx-23',12,12,501),(543,'xxx-23',12,12,510),(643,'xxx-33',11,17,700),(343,'xxx-33',11,17,700),(766,'xxx-74',32,1,300),(766,'xxx-74',32,1,300),(877,'xxx-32',12,2,300),(877,'xxx-32',12,2,300),(877,'xxx-32',12,2,301);
with cte as
(
select CDR_ID
,TelephoneNo
,Call_ID
,call_Duration
,Call_Plan
-- after the distinct below, c is the number of different call plans per combination
,count(*) over (partition by CDR_ID,TelephoneNo,Call_ID,call_Duration) as c
from (select distinct * from @t) a
)
select *
from cte
where c = 1;
Output:
+--------+-------------+---------+---------------+-----------+---+
| CDR_ID | TelephoneNo | Call_ID | call_Duration | Call_Plan | c |
+--------+-------------+---------+---------------+-----------+---+
| 343 | xxx-33 | 11 | 17 | 700 | 1 |
| 643 | xxx-33 | 11 | 17 | 700 | 1 |
| 766 | xxx-74 | 32 | 1 | 300 | 1 |
+--------+-------------+---------+---------------+-----------+---+
using not exists()
select distinct *
from t
where not exists (
select 1
from t as i
where i.cdr_id = t.cdr_id
and i.telephoneno = t.telephoneno
and i.call_id = t.call_id
and i.call_duration = t.call_duration
and i.call_plan <> t.call_plan
)
rextester demo: http://rextester.com/RRNNE20636
returns:
+--------+-------------+---------+---------------+-----------+-----+
| cdr_id | TelephoneNo | Call_id | call_Duration | Call_Plan | cnt |
+--------+-------------+---------+---------------+-----------+-----+
| 343 | xxx-33 | 11 | 17 | 700 | 1 |
| 643 | xxx-33 | 11 | 17 | 700 | 1 |
| 766 | xxx-74 | 32 | 1 | 300 | 1 |
+--------+-------------+---------+---------------+-----------+-----+
Basically you should try this:
SELECT DISTINCT A.CDR_ID, A.TelephoneNo, A.Call_ID, A.call_Duration, A.Call_Plan
FROM YOUR_TABLE A
INNER JOIN (SELECT CDR_ID,TelephoneNo,Call_ID,call_Duration
FROM YOUR_TABLE
GROUP BY CDR_ID,TelephoneNo,Call_ID,call_Duration
-- keep only the combinations that have a single call plan, however often it repeats
HAVING COUNT(DISTINCT Call_Plan)=1
) B ON A.CDR_ID= B.CDR_ID AND A.TelephoneNo=B.TelephoneNo AND A.Call_ID=B.Call_ID AND A.call_Duration=B.call_Duration
You can write a shorter query using the window function COUNT(*) OVER ...
The query below will give you the result:
SELECT CDR_ID,TelephoneNo,Call_ID,call_Duration,MIN(Call_Plan) AS Call_Plan, COUNT(DISTINCT Call_Plan) AS plan_count
FROM TABLE_NAME GROUP BY CDR_ID,TelephoneNo,Call_ID,call_Duration
HAVING COUNT(DISTINCT Call_Plan) < 2;
It also returns the count. If that is not required, you can remove it.
Select *, count(CDR_ID)
from table
group by CDR_ID, TelephoneNo, Call_ID, call_Duration, Call_Plan
having count(CDR_ID) = 1
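Because the source repeats identical rows, counting rows per group does not separate the cases; counting distinct call plans does. The sketch below checks that reading of the requirement against the question's sample data, assuming Python's sqlite3 module as a stand-in database. The table name cdr and the function single_plan_records are hypothetical.

```python
import sqlite3

ROWS = [
    (543, "xxx-23", 12, 12, 500), (543, "xxx-23", 12, 12, 501), (543, "xxx-23", 12, 12, 510),
    (643, "xxx-33", 11, 17, 700), (343, "xxx-33", 11, 17, 700),
    (766, "xxx-74", 32, 1, 300), (766, "xxx-74", 32, 1, 300),
    (877, "xxx-32", 12, 2, 300), (877, "xxx-32", 12, 2, 300), (877, "xxx-32", 12, 2, 301),
]

SINGLE_PLAN_SQL = """
SELECT CDR_ID, TelephoneNo, Call_ID, call_Duration,
       MIN(Call_Plan) AS Call_Plan          -- the only plan in the group
FROM cdr
GROUP BY CDR_ID, TelephoneNo, Call_ID, call_Duration
HAVING COUNT(DISTINCT Call_Plan) = 1        -- repeated copies of one plan count once
ORDER BY CDR_ID
"""


def single_plan_records(rows):
    """Return one row per combination that has exactly one distinct Call_Plan."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cdr (CDR_ID INTEGER, TelephoneNo TEXT, Call_ID INTEGER, "
        "call_Duration INTEGER, Call_Plan INTEGER)"
    )
    conn.executemany("INSERT INTO cdr VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(SINGLE_PLAN_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for record in single_plan_records(ROWS):
        print(record)
```

On the sample data this returns the 343, 643 and 766 rows, which is the set the question asks for. The duplicated 766 row collapses to one, and 877 is excluded because it has two different plans.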

Find one single row for a column with a unique value using SQL

I have a table which contains data similar to this:
RowID | CustomerID | Quantity | Type | .....
1 | 345 | 100 | Software | .....
2 | 1280 | 200 | Software | .....
3 | 456 | 20 | Hub | .....
4 | 345 | 100 | Software | .....
5 | 345 | 180 | Monitor | .....
6 | 23 | 15 | Router | .....
7 | 1280 | 120 | Software | .....
8 | 345 | 5 | Mac | .....
.... | .... | ... | ..... | .....
The table has hundreds of thousands of rows. As you can see, CustomerID has duplicates.
What I want to do is to find EXACTLY ONE row for each unique CustomerID and Type combination and with Quantity more than 10.
For example, for the above table, I want to get:
RowID | CustomerID | Quantity | Type | .....
2 | 1280 | 200 | Software | .....
3 | 456 | 20 | Hub | .....
4 | 345 | 100 | Software | .....
5 | 345 | 180 | Monitor | .....
6 | 23 | 15 | Router | .....
What I tried to do is:
select distinct CustomerID, Type from MyTable
where Quantity > 10
Which gives me:
CustomerID | Type
1280 | Software
456 | Hub
345 | Software
345 | Monitor
23 | Router
But I don't know how to select other columns because if I do:
select distinct CustomerID, Type, RowID, Quantity from MyTable
where Quantity > 10
It returns every row because RowID is unique.
I think maybe I should use a subquery that iterates over the result of the above query. Can someone help me with this?
Use row_number() with a partition. This groups all similar rows together and numbers them within each group, and then you query that result to get just the first row. Note: an "order by" must be specified in the over clause, even if you don't use the value. Here it is useful for pulling the row with the highest quantity for each combination. If you also want distinct Quantity, add that column to the partition by.
select RowID
, CustomerId
, Quantity
, Type
FROM
(
select
RowID
, CustomerId
, Quantity
, Type
, row_number() over (partition by CustomerId, Type order by Quantity desc) as rn
From MyTable
where Quantity > 10
) dta
Where rn = 1
Something like this will work (unless you have more requirements that you didn't mention):
SELECT CustomerID, Type, SUM(Quantity) AS Quantity
FROM MyTable
GROUP BY CustomerID, Type
HAVING SUM(Quantity) > 10
You need to choose which one of the "duplicated" rows to retrieve.
I wrote "duplicated" in quotes because they are not technically duplicates:
+-------+------------+----------+----------+
| RowID | CustomerID | Type | Quantity |
+-------+------------+----------+----------+
| 1 | 345 | Software | 100 |
| 2 | 345 | Software | 200 |
| 3 | 345 | Software | 300 |
+-------+------------+----------+----------+
All of these are different rows because of the different RowID and Quantity values.
So you must specify which one of them you want to retrieve.
For this example I will use the row with the minimum RowID and Quantity.
To tell SQL to pick this one, I order the rows by RowID and Quantity in ascending order in a correlated subquery on the same table,
so I can pick the first row, the one with the lowest RowID and Quantity, for the same CustomerID and Type.
+-------+------------+----------+----------+
| RowID | CustomerID | Type | Quantity |
+-------+------------+----------+----------+
| 1 | 345 | Software | 100 |
+-------+------------+----------+----------+
The SQL code for this is the following:
SELECT
    *
FROM
    MyTable originalTable
WHERE
    originalTable.Quantity > 10 AND
    originalTable.RowID =
    (
        SELECT TOP 1 orderedTable.RowID
        FROM MyTable orderedTable
        WHERE orderedTable.CustomerID = originalTable.CustomerID AND orderedTable.Type = originalTable.Type
            -- apply the same filter here so a low-quantity row cannot hide its group
            AND orderedTable.Quantity > 10
        ORDER BY orderedTable.RowID ASC, orderedTable.Quantity ASC
    )
One way is to use the row_number window function to partition the data by CustomerID and Type, and then keep only the first row in each partition.
WITH Uniq AS (
SELECT
CustomerID, Type, RowID, Quantity,
rn = ROW_NUMBER() OVER (PARTITION BY CustomerID, Type ORDER BY RowID)
FROM MyTable WHERE Quantity > 10
)
SELECT * FROM Uniq WHERE rn = 1;
Or you could find a unique RowID (min or max) for each group of CustomerID and Type and use that as the source in a join, either as a common table expression or as a derived table:
WITH Uniq AS (
SELECT MIN(RowID) RowID FROM MyTable WHERE Quantity > 10 GROUP BY CustomerID, Type
)
SELECT MyTable.* FROM MyTable JOIN Uniq ON MyTable.RowID = Uniq.RowID
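The ROW_NUMBER and MIN(RowID) approaches should pick the same rows when both order by RowID. The sketch below checks that on the question's eight sample rows, assuming Python's sqlite3 module with window-function support instead of SQL Server. The function one_row_per_group and the alias FirstRowID are made up for illustration.

```python
import sqlite3

ROWS = [
    (1, 345, 100, "Software"), (2, 1280, 200, "Software"), (3, 456, 20, "Hub"),
    (4, 345, 100, "Software"), (5, 345, 180, "Monitor"), (6, 23, 15, "Router"),
    (7, 1280, 120, "Software"), (8, 345, 5, "Mac"),
]

# Number the rows inside each (CustomerID, Type) group and keep the first one.
ROW_NUMBER_SQL = """
WITH Uniq AS (
    SELECT RowID, CustomerID, Quantity, Type,
           ROW_NUMBER() OVER (PARTITION BY CustomerID, Type ORDER BY RowID) AS rn
    FROM MyTable
    WHERE Quantity > 10
)
SELECT RowID, CustomerID, Quantity, Type FROM Uniq WHERE rn = 1 ORDER BY RowID
"""

# Find the smallest RowID per group, then join back to fetch the full row.
MIN_ROWID_SQL = """
SELECT m.RowID, m.CustomerID, m.Quantity, m.Type
FROM MyTable AS m
JOIN (
    SELECT MIN(RowID) AS FirstRowID
    FROM MyTable
    WHERE Quantity > 10
    GROUP BY CustomerID, Type
) AS u ON m.RowID = u.FirstRowID
ORDER BY m.RowID
"""


def one_row_per_group(rows):
    """Return (row_number_result, min_rowid_result) for the given sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (RowID INTEGER PRIMARY KEY, CustomerID INTEGER, "
        "Quantity INTEGER, Type TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", rows)
    by_row_number = conn.execute(ROW_NUMBER_SQL).fetchall()
    by_min_rowid = conn.execute(MIN_ROWID_SQL).fetchall()
    conn.close()
    return by_row_number, by_min_rowid


if __name__ == "__main__":
    for row in one_row_per_group(ROWS)[0]:
        print(row)
```

Both queries return rows 1, 2, 3, 5 and 6 here. The question's expected output lists row 4 rather than row 1 for customer 345 and Software; either satisfies "exactly one row per combination", and changing the ORDER BY (or using MAX) selects a different representative.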

SQL Server : query grouping

I have a question about a query in SQL Server. I have two tables:
keyword_text
Keyword_relate
Columns in keyword_text:
key_id
keywords
Columns in keyword_relate:
key_id
product_id
score
status
Sample data for keyword_text:
key_id | keywords
-------|----------
1 | Pencil
2 | Pen
3 | Books
Sample data for keyword_relate:
Sno | Product | Score | status
----|---------|-------|-------
1 | 124 | 2 | 1
1 | 125 | 3 | 1
2 | 124 | 3 | 1
2 | 125 | 2 | 1
From this I want to get, for each keyword, the product_id that has the maximum score.
Presuming that key_id of the first table is Sno in the second table, you can use ROW_NUMBER:
WITH CTE AS
(
SELECT Product AS ProductID, Score As MaxScore,
RN = ROW_NUMBER() OVER (PARTITION BY kt.key_id ORDER BY Score DESC)
FROM keyword_text kt INNER JOIN keyword_relate kr
ON kt.key_id = kr.Sno
)
SELECT ProductID, MaxScore
FROM CTE
WHERE RN = 1
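The ROW_NUMBER pattern above can be tried on the question's sample data. The sketch below is one way to do that, assuming Python's sqlite3 module with window-function support and assuming the column names from the question's column list (key_id, product_id, score, status) rather than the Sno and Product headers of the sample table. The function name top_product_per_keyword is hypothetical.

```python
import sqlite3

KEYWORDS = [(1, "Pencil"), (2, "Pen"), (3, "Books")]
RELATIONS = [(1, 124, 2, 1), (1, 125, 3, 1), (2, 124, 3, 1), (2, 125, 2, 1)]

TOP_PRODUCT_SQL = """
WITH ranked AS (
    SELECT kt.key_id, kt.keywords, kr.product_id, kr.score,
           -- rn = 1 marks the highest-scoring product of each keyword
           ROW_NUMBER() OVER (PARTITION BY kt.key_id ORDER BY kr.score DESC) AS rn
    FROM keyword_text AS kt
    JOIN keyword_relate AS kr ON kr.key_id = kt.key_id
)
SELECT keywords, product_id, score
FROM ranked
WHERE rn = 1
ORDER BY key_id
"""


def top_product_per_keyword(keywords, relations):
    """Return (keyword, product_id, score) for the best-scoring product of each keyword."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE keyword_text (key_id INTEGER PRIMARY KEY, keywords TEXT)")
    conn.execute(
        "CREATE TABLE keyword_relate (key_id INTEGER, product_id INTEGER, "
        "score INTEGER, status INTEGER)"
    )
    conn.executemany("INSERT INTO keyword_text VALUES (?, ?)", keywords)
    conn.executemany("INSERT INTO keyword_relate VALUES (?, ?, ?, ?)", relations)
    result = conn.execute(TOP_PRODUCT_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(top_product_per_keyword(KEYWORDS, RELATIONS))
```

With the sample data this yields product 125 for Pencil and product 124 for Pen. Books has no related rows, so the inner join drops it. If two products tie for the top score, ROW_NUMBER keeps only one of them, chosen arbitrarily.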

Recursive function in SQL [duplicate]

This question already has answers here: How to get cumulative sum (16 answers)
Closed 9 years ago.
I have the data in my table as follows:
+----+-----+
| ID | Qty |
+----+-----+
| 1 | 100 |
| 2 | 200 |
| 3 | 150 |
| 4 | 50 |
+----+-----+
I need the result as follows:
+----+-----+-------+
| ID | Qty | C.Qty |
+----+-----+-------+
| 1 | 100 | 100 |
| 2 | 200 | 300 |
| 3 | 150 | 450 |
| 4 | 50 | 500 |
+----+-----+-------+
The third column should be the sum of Qty for the current row and all previous rows.
Can anyone please help?
I would just use a subquery:
SELECT ID, Qty,
(SELECT SUM(Qty) FROM [My Table] b WHERE b.ID <= [My Table].ID) AS [Total Qty]
FROM [My Table]
Please try:
SELECT S1.ID, S1.Qty ,sum(S2.Qty) CUM_SUM
FROM YourTable S1 join YourTable S2
on S1.ID>=S2.ID
group by S1.ID, S1.Qty
ORDER BY S1.ID
Using a windowed SUM:
SELECT ID, Qty,
SUM(Qty) OVER(ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS [C.Qty]
FROM YourTable
Try this:
select a.id,a.qty,sum(b.qty) as total_qty
from YourTable a cross join YourTable b
where b.id <= a.id
group by a.id,a.qty
order by a.id
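The windowed SUM and the self-join compute the same running total. The sketch below compares the two on the question's four rows, assuming Python's sqlite3 module with window-function support as a stand-in for SQL Server. The table name stock, the column alias CumQty and the function cumulative_quantities are hypothetical.

```python
import sqlite3

ROWS = [(1, 100), (2, 200), (3, 150), (4, 50)]

# Running total with a window frame from the first row up to the current row.
WINDOW_SQL = """
SELECT ID, Qty,
       SUM(Qty) OVER (ORDER BY ID
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CumQty
FROM stock
ORDER BY ID
"""

# Same result by joining every row to all rows with a smaller or equal ID.
SELF_JOIN_SQL = """
SELECT a.ID, a.Qty, SUM(b.Qty) AS CumQty
FROM stock AS a
JOIN stock AS b ON b.ID <= a.ID
GROUP BY a.ID, a.Qty
ORDER BY a.ID
"""


def cumulative_quantities(rows):
    """Return (window_result, self_join_result) as lists of (ID, Qty, CumQty)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (ID INTEGER PRIMARY KEY, Qty INTEGER)")
    conn.executemany("INSERT INTO stock VALUES (?, ?)", rows)
    window_result = conn.execute(WINDOW_SQL).fetchall()
    self_join_result = conn.execute(SELF_JOIN_SQL).fetchall()
    conn.close()
    return window_result, self_join_result


if __name__ == "__main__":
    for row in cumulative_quantities(ROWS)[0]:
        print(row)
```

Both queries produce 100, 300, 450 and 500 in the third column. The self-join compares every row with every earlier row, so on large tables the window version is usually the cheaper of the two.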