Omit Duplicate Only When In Sequential Order

Omit Duplicate Only When In Sequential Order - sql

I've got a query returning the following information. I used Over(PARTITION BY to include row numbers. I'm capturing every time my work_center_S begins a new Order#, but I want to exclude the beginning of a new order when the part_number was the same as the order/row previous to it. I'm not able to use the DISTINCT function because a part_number may appear numerous times a day and I'll need to capture every time such a change occurs.
[![Query Return][1]][1]
[1]: https://i.stack.imgur.com/IKvsR.jpg
+----+--------+---------------+-------------+------+
| rn | Order# | work_center_S | part_number | Hour |
+----+--------+---------------+-------------+------+
| 1 | 7098 | TB312 | 37203 | 1 |
+----+--------+---------------+-------------+------+
| 2 | 8797 | TB312 | 37194 | 4 |
+----+--------+---------------+-------------+------+
| 3 | 8802 | TB312 | 37355 | 11 |
+----+--------+---------------+-------------+------+
| 4 | 0946 | TB312 | 37194 | 15 |
+----+--------+---------------+-------------+------+
| 5 | 0698 | TB312 | 37203 | 18 |
+----+--------+---------------+-------------+------+
| 6 | 0699 | TB312 | 37203 | 21 |
+----+--------+---------------+-------------+------+

I assume there isn't a -1 part_number
select Order#,work_center_S,part_number,Hour
from (select *
,lag(part_number,1,-1) over
(
partition by work_center_S
order by Hour
) as prev_part_number
from mytable
) t
where part_number <> prev_part_number
--
+--------+---------------+-------------+------+
| Order# | work_center_S | part_number | Hour |
+--------+---------------+-------------+------+
| 7098 | TB312 | 37203 | 1 |
+--------+---------------+-------------+------+
| 8797 | TB312 | 37194 | 4 |
+--------+---------------+-------------+------+
| 8802 | TB312 | 37355 | 11 |
+--------+---------------+-------------+------+
| 946 | TB312 | 37194 | 15 |
+--------+---------------+-------------+------+
| 698 | TB312 | 37203 | 18 |
+--------+---------------+-------------+------+

Related

How I can I add a count to rank null values in SQL Hive?

This is what I have right now:
| time | car_id | order | in_order |
|-------|--------|-------|----------|
| 12:31 | 32 | null | 0 |
| 12:33 | 32 | null | 0 |
| 12:35 | 32 | null | 0 |
| 12:37 | 32 | 123 | 1 |
| 12:38 | 32 | 123 | 1 |
| 12:39 | 32 | 123 | 1 |
| 12:41 | 32 | 123 | 1 |
| 12:43 | 32 | 123 | 1 |
| 12:45 | 32 | null | 0 |
| 12:47 | 32 | null | 0 |
| 12:49 | 32 | 321 | 1 |
| 12:51 | 32 | 321 | 1 |
I'm trying to rank orders, including those who have null values, in this case by car_id.
This is the result I'm looking for:
| time | car_id | order | in_order | row |
|-------|--------|-------|----------|-----|
| 12:31 | 32 | null | 0 | 1 |
| 12:33 | 32 | null | 0 | 1 |
| 12:35 | 32 | null | 0 | 1 |
| 12:37 | 32 | 123 | 1 | 2 |
| 12:38 | 32 | 123 | 1 | 2 |
| 12:39 | 32 | 123 | 1 | 2 |
| 12:41 | 32 | 123 | 1 | 2 |
| 12:43 | 32 | 123 | 1 | 2 |
| 12:45 | 32 | null | 0 | 3 |
| 12:47 | 32 | null | 0 | 3 |
| 12:49 | 32 | 321 | 1 | 4 |
| 12:51 | 32 | 321 | 1 | 4 |
I just don't know how to manage a count for the null values.
Thanks!

You can count the number of non-NULL values before each row and then use dense_rank():
select t.*,
dense_rank() over (partition by car_id order by grp) as row
from (select t.*,
count(order) over (partition by car_id order by time) as grp
from t
) t;

SQL Server : sort ranking list

I have a table with 3 columns ordernum, username, and amount. I want to select its rows and show an additional column expected.
The rule for calculating the expected column is as follows:
These rows same OrderNum value will be ranking again based on amount column (Desc order). I don't know how I can describe, but the expected result is shown below :(
I tried with RANK() and ROW_NUMBER(), but have not been able to properly apply above algorithm.
This is my table declaration:
CREATE TABLE data
(
ordernum INT,
username NVARCHAR(30),
amount MONEY
);
This is my table content:
+----------+----------+------------+
| ORDERNUM | USERNAME | AMOUNT |
+----------+----------+------------+
| 1 | test01 | 18382.5079 |
| 1 | test02 | 10476.0000 |
| 1 | test03 | 8128.0000 |
| 1 | test04 | 6680.0000 |
| 1 | test05 | 5388.9673 |
| 1 | test06 | 5356.0000 |
| 12 | test07 | 2806.0000 |
| 12 | test08 | 2806.0000 |
| 12 | test09 | 2806.0000 |
| 14 | test10 | 2530.0000 |
| 15 | test11 | 2330.0000 |
| 16 | test12 | 2183.0000 |
| 16 | test13 | 2182.0000 |
| 17 | test14 | 2000.0000 |
| 18 | test15 | 1621.0000 |
+----------+----------+------------+
And this is my expected result:
+----------+----------+------------+----------+
| ORDERNUM | USERNAME | AMOUNT | EXPECTED |
+----------+----------+------------+----------+
| 1 | test01 | 18382.5079 | 1 |
| 1 | test02 | 10476.0000 | 2 |
| 1 | test03 | 8128.0000 | 3 |
| 1 | test04 | 6680.0000 | 4 |
| 1 | test05 | 5388.9673 | 5 |
| 1 | test06 | 5356.0000 | 6 |
| 12 | test07 | 2806.0000 | 12 |
| 12 | test08 | 2806.0000 | 12 |
| 12 | test09 | 2806.0000 | 12 |
| 14 | test10 | 2530.0000 | 15 |
| 15 | test11 | 2330.0000 | 16 |
| 16 | test12 | 2183.0000 | 17 |
| 16 | test13 | 2182.0000 | 18 |
| 17 | test14 | 2000.0000 | 19 |
| 18 | test15 | 1621.0000 | 20 |
+----------+----------+------------+----------+
Here is a fiddle for the problem: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=4014cb469a9ec8f57ded5a5e0e60adaf

This should work.
select
OrderNum,
Username,
Amount,
RANK() over (order by OrderNum) as Expected
from yourTable

PostgreSQL: Count number of rows in table 1 for distinct rows in table 2

I am working with really big data that at the moment I become confused, looking like I'm just repeating one thing.
I want to count the number of trips per user from two tables, trips and session.
psql=> SELECT * FROM trips limit 10;
trip_id | session_ids | daily_user_id | seconds_start | seconds_end
---------+-----------------+---------------+---------------+-------------
400543 | {172079} | 17118 | 1575550944 | 1575551181
400542 | {172078} | 17118 | 1575541533 | 1575542171
400540 | {172077} | 17118 | 1575539001 | 1575539340
400538 | {172076} | 17117 | 1575540499 | 1575541999
400534 | {172074,172075} | 17117 | 1575537161 | 1575539711
400530 | {172073} | 17116 | 1575447043 | 1575447682
400529 | {172071} | 17115 | 1575496394 | 1575497803
400527 | {172070} | 17113 | 1575495241 | 1575496034
400525 | {172068} | 17115 | 1575485658 | 1575489378
400524 | {172067} | 17113 | 1575488721 | 1575490491
(10 rows)
psql=> SELECT * FROM session limit 10;
session_id | user_id | key | start_time | daily_user_id
------------+---------+--------------------------+------------+---------------
172079 | 43 | hLB8S7aSfp4gAFp7TykwYQ==+| 1575550921 | 17118
| | | |
172078 | 43 | YATMrL/AQ7Nu5q2dQTMT1A==+| 1575541530 | 17118
| | | |
172077 | 43 | fOLX4tqvsyFOP3DCyBZf1A==+| 1575538997 | 17118
| | | |
172076 | 7 | 88hwGj4Mqa58juy0PG/R4A==+| 1575540515 | 17117
| | | |
172075 | 7 | 1O+8X49+YbtmoEa9BlY5OQ==+| 1575538384 | 17117
| | | |
172074 | 7 | XOR7hsFCNk+soM75ZhDJyA==+| 1575537405 | 17117
| | | |
172073 | 42 | rAQWwYgqg3UMTpsBYSpIpA==+| 1575447109 | 17116
| | | |
172072 | 276 | 0xOsxRRN3Sq20VsXWjlrzQ==+| 1575511120 | 17114
| | | |
172071 | 7 | P4beN3W/ZrD+TCpZGYh23g==+| 1575496642 | 17115
| | | |
172070 | 43 | OFi30Zv9e5gmLZS5Vb+I7Q==+| 1575495238 | 17113
| | | |
(10 rows)
Goal: get the distribution of trips per user
Attempt:
psql=> SELECT COUNT(distinct trip_id) as trips
, count(distinct user_id) as users
, extract(year from to_timestamp(seconds_start)) as year_date
, extract(month from to_timestamp(seconds_start)) as month_date
FROM trips
INNER JOIN session
ON session_id = ANY(session_ids)
GROUP BY year_date, month_date
ORDER BY year_date, month_date;
+-------+-------+-----------+------------+
| trips | users | year_date | month_date |
+-------+-------+-----------+------------+
| 371 | 44 | 2016 | 3 |
| 12207 | 185 | 2016 | 4 |
| 3859 | 88 | 2016 | 5 |
| 1547 | 28 | 2016 | 6 |
| 831 | 17 | 2016 | 7 |
| 427 | 4 | 2016 | 8 |
| 512 | 13 | 2016 | 9 |
| 431 | 11 | 2016 | 10 |
| 1011 | 26 | 2016 | 11 |
| 791 | 15 | 2016 | 12 |
| 217 | 8 | 2017 | 1 |
| 490 | 17 | 2017 | 2 |
| 851 | 18 | 2017 | 3 |
| 1890 | 66 | 2017 | 4 |
| 2143 | 43 | 2017 | 5 |
| . | | | |
| . | | | |
| . | | | |
+-------+-------+-----------+------------+
This resultset count number of users and trips, my intention is actually to get an analysis of trips per user, like so:
+------+-------------+
| user | no_of_trips |
+------+-------------+
| 1 | 489 |
| 2 | 400 |
| 3 | 12 |
| 4 | 102 |
| . | |
| . | |
| . | |
+------+-------------+
How do I do this, please?

You seem to just want aggregation by user_id:
SELECT s.user_id, COUNT(distinct t.trip_id) as trips
FROM trips t INNER JOIN
session s
ON s.session_id = ANY(t.session_ids)
GROUP BY s.user_id ;
I'm pretty sure that the COUNT(DISTINCT) is unnecessary, so I would advise removing it:
SELECT s.user_id, COUNT(*) as trips
FROM trips t INNER JOIN
session s
ON s.session_id = ANY(t.session_ids)
GROUP BY s.user_id ;

SQL subcategory total is not properly placed at the end of every parent category

I am having trouble in SQl query,The query result should be like this
+------------+------------+-----+------+-------+--+--+--+
| District | Tehsil | yes | no | Total | | | |
+------------+------------+-----+------+-------+--+--+--+
| ABBOTTABAD | ABBOTTABAD | 377 | 5927 | 6304 | | | |
| ABBOTTABAD | HAVELIAN | 112 | 2276 | 2388 | | | |
| ABBOTTABAD | Overall | 489 | 8203 | 8692 | | | |
| CHARSADDA | CHARSADDA | 289 | 3762 | 4051 | | | |
| CHARSADDA | SHABQADAR | 121 | 1376 | 1497 | | | |
| CHARSADDA | TANGI | 94 | 1703 | 1797 | | | |
| CHARSADDA | Overall | 504 | 6841 | 7345 | | | |
+------------+------------+-----+------+-------+--+--+--+
The overall total should be should be shown at the end of every parent category but now it is showing like this
+------------+------------+-----+------+-------+--+--+--+
| District | Tehsil | yes | no | Total | | | |
+------------+------------+-----+------+-------+--+--+--+
| ABBOTTABAD | ABBOTTABAD | 377 | 5927 | 6304 | | | |
| ABBOTTABAD | HAVELIAN | 112 | 2276 | 2388 | | | |
| ABBOTTABAD | Overall | 489 | 8203 | 8692 | | | |
| CHARSADDA | CHARSADDA | 289 | 3762 | 4051 | | | |
| CHARSADDA | Overall | 504 | 6841 | 7345 | | | |
| CHARSADDA | SHABQADAR | 121 | 1376 | 1497 | | | |
| CHARSADDA | TANGI | 94 | 1703 | 1797 | | | |
+------------+------------+-----+------+-------+--+--+--+
My query is sorting second column with respect to first column although order by query is applied on my first column. This is my query
select District as 'District', tName as 'tehsil',[1] as 'yes',[0] as 'no',ISNULL([1]+[0], 0) as "Total" from
(
select d.Name as 'District',
case when grouping (t.Name)=1 then 'Overall' else t.Name end as tName,
BoundaryWallAvailable,
count(*) as total from School s
INNER JOIN SchoolIndicator i ON (i.refSchoolID=s.SchoolID)
INNER JOIN Tehsil t ON (t.TehsilID=s.refTehsilID)
INNER JOIN district d ON (d.DistrictID=t.refDistrictID)
group by
GROUPING sets((d.Name, BoundaryWallAvailable), (d.Name,t.Name, BoundaryWallAvailable))
) B
PIVOT
(
max(total) for BoundaryWallAvailable in ([1],[0])
) as Pvt
order by District
P.S: BoundaryWall is one column through pivoting i am breaking it into Yes and No Column

how to write a query to get multilevel data

I have four tables as below:
tblAccount
Id i sprimary key
+----+-----------------+
| Id | AccName |
+----+-----------------+
| 1 | AccountA |
| 2 | AccountB |
+----+-----------------+
tblLocation
Id is primary key.
+----+---------------+
| Id | LocName |
+----+---------------+
| 1 | LocationA |
| 2 | LocationB |
| 3 | LocationC |
+----+---------------+
tblAccountwiseLocation
Id i sprimary key.LocId and AccId are foreign key.
+----+---------------+---------------+
| Id | LocId | AccId |
+----+---------------+---------------+
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 4 | 1 | 2 |
| 5 | 2 | 2 |
| 6 | 3 | 2 |
+----+---------------+---------------+
tblRSCMaster
Id i sprimary key.LocId and AccId are foreign key.
+----+---------------+---------------+----------------+------------------+
| Id | LocId | AccId | RSCNo | DateOfAddition |
+----+---------------+---------------+----------------+------------------+
| 1 | 1 | 1 | Acc1_Loc1_1_14 | 15/01/2014 |
| 2 | 2 | 1 | Acc1_Loc2_1_14 | 15/01/2014 |
| 3 | 3 | 1 | Acc1_Loc2_1_14 | 15/01/2014 |
| 4 | 1 | 2 | Acc2_Loc1_1_14 | 15/01/2014 |
| 5 | 2 | 2 | Acc2_Loc2_1_14 | 15/01/2014 |
| 6 | 3 | 2 | Acc2_Loc3_1_14 | 15/01/2014 |
| 7 | 1 | 1 | Acc1_Loc1_2_14 | 15/02/2014 |
| 8 | 2 | 1 | Acc1_Loc2_2_14 | 15/02/2014 |
| 9 | 3 | 1 | Acc1_Loc3_2_14 | 15/02/2014 |
| 10 | 1 | 2 | Acc2_Loc1_2_14 | 15/02/2014 |
| 11 | 2 | 2 | Acc2_Loc2_2_14 | 15/02/2014 |
| 12 | 3 | 2 | Acc2_Loc3_2_14 | 15/02/2014 |
| 13 | 1 | 1 | Acc1_Loc1_3_14 | 15/03/2014 |
| 14 | 2 | 1 | Acc1_Loc2_3_14 | 15/03/2014 |
| 15 | 3 | 1 | Acc1_Loc3_3_14 | 15/03/2014 |
| 16 | 1 | 2 | Acc2_Loc1_3_14 | 15/03/2014 |
| 17 | 2 | 2 | Acc2_Loc2_3_14 | 15/03/2014 |
| 18 | 3 | 2 | Acc2_Loc3_3_14 | 15/03/2014 |
| 19 | 1 | 1 | Acc1_Loc1_4_14 | 15/04/2014 |
| 20 | 2 | 1 | Acc1_Loc2_4_14 | 15/04/2014 |
| 21 | 3 | 1 | Acc1_Loc3_4_14 | 15/04/2014 |
| 22 | 1 | 2 | Acc2_Loc1_4_14 | 15/04/2014 |
| 23 | 2 | 2 | Acc2_Loc2_4_14 | 15/04/2014 |
| 24 | 3 | 2 | Acc2_Loc3_4_14 | 15/04/2014 |
| 25 | 1 | 1 | Acc1_Loc1_5_14 | 15/05/2014 |
| 26 | 2 | 1 | Acc1_Loc2_5_14 | 15/05/2014 |
| 27 | 3 | 1 | Acc1_Loc3_5_14 | 15/05/2014 |
| 28 | 1 | 2 | Acc2_Loc1_5_14 | 15/05/2014 |
| 29 | 2 | 2 | Acc2_Loc2_5_14 | 15/05/2014 |
| 30 | 3 | 2 | Acc2_Loc3_5_14 | 15/05/2014 |
+----+---------------+---------------+----------------+------------------+
Acc1_Loc1_1_14 resembles RSC for LocationA of AccountA for Jan 2014.
I need to get a output as below from tblRSCMaster.
+---------------+---------------+----------------+------------------+
| LocId | AccId | RSCNo | DateOfAddition |
+---------------+---------------+----------------+------------------+
| 1 | 1 | Acc1_Loc1_3_14 | 15/03/2014 |
| 1 | 1 | Acc1_Loc1_4_14 | 15/04/2014 |
| 1 | 1 | Acc1_Loc1_5_14 | 15/05/2014 |
| 2 | 1 | Acc1_Loc2_3_14 | 15/03/2014 |
| 2 | 1 | Acc1_Loc2_4_14 | 15/04/2014 |
| 2 | 1 | Acc1_Loc2_5_14 | 15/05/2014 |
| 3 | 1 | Acc1_Loc3_3_14 | 15/03/2014 |
| 3 | 1 | Acc1_Loc3_4_14 | 15/04/2014 |
| 3 | 1 | Acc1_Loc3_5_14 | 15/05/2014 |
+---------------+---------------+----------------+------------------+
Each account has multiple locations and each location has multiple RSCs.
I need to get last three RSCs for each location for AccountA.
I have tried the below query:
SELECT tblAccountwiseLocation.LocId,tblAccountwiseLocation.AccId,tblRSCMaster.RSCNo,tblRSCMaster.DateOfAddition FROM tblAccountwiseLocation
INNER JOIN tblRSCMaster ON tblAccountwiseLocation.LocId= tblRSCMaster.LocId
where tblRSCMaster.AccId=1
But not getting the proper output.
Please help me out.
Thank you all in advance.

You can wrap the existing query inside a common table expression, and use ROW_NUMBER() to get only the last 3 (by tblRSCMaster.DateOfAddition) entries per tblAccountwiseLocation.LocId.
WITH cte AS (
SELECT tblAccountwiseLocation.LocId,
tblAccountwiseLocation.AccId,
tblRSCMaster.RSCNo,
tblRSCMaster.DateOfAddition,
ROW_NUMBER() OVER (PARTITION BY tblAccountwiseLocation.LocId
ORDER BY tblRSCMaster.DateOfAddition DESC) rn
FROM tblAccountwiseLocation
INNER JOIN tblRSCMaster
ON tblAccountwiseLocation.LocId = tblRSCMaster.LocId
AND tblAccountwiseLocation.AccId = tblRSCMaster.AccId
WHERE tblRSCMaster.AccId=1
)
SELECT LocId, AccId, RSCNo, DateOfAddition
FROM cte
WHERE rn <= 3
ORDER BY LocId, AccId, DateOfAddition
An SQLfiddle to test with.

Is this what you need?
select m.*
from (select m.*, row_number() over (partition by accID
order by DateOfAddition desc) as seqnum
from tblRSCMaster
where m.locid = 1
) m
where seqnum <= 3
order by AccId, DateOfAddition;
I think you need to filter on the locid rather than on the AccId to get what you want.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Omit Duplicate Only When In Sequential Order - sql

Related

How I can I add a count to rank null values in SQL Hive?

SQL Server : sort ranking list

PostgreSQL: Count number of rows in table 1 for distinct rows in table 2

SQL subcategory total is not properly placed at the end of every parent category

how to write a query to get multilevel data

Categories

Resources