Getting two columns one containing and one not containing a grouped value

Getting two columns one containing and one not containing a grouped value - sql

My data looks like this -
+-----------+-----------+-----------+----------+
| FLIGHT_NO | FL_DATE | SERIAL_NO | PILOT_NO |
+-----------+-----------+-----------+----------+
| 501 | 15-OCT-19 | 456710 | 345 |
| 521 | 16-OCT-19 | 562911 | 345 |
| 534 | 17-OCT-19 | 877694 | 345 |
| 577 | 17-OCT-19 | 338157 | 345 |
| 501 | 14-OCT-19 | 921225 | 346 |
| 534 | 15-OCT-19 | 877694 | 346 |
| 534 | 14-OCT-19 | 338157 | 347 |
| 590 | 16-OCT-19 | 650012 | 347 |
| 531 | 14-OCT-19 | 562911 | 348 |
| 531 | 15-OCT-19 | 562911 | 348 |
| 501 | 16-OCT-19 | 220989 | 349 |
| 521 | 18-OCT-19 | 650012 | 349 |
| 590 | 14-OCT-19 | 562911 | 351 |
| 577 | 18-OCT-19 | 877694 | 351 |
| 590 | 18-OCT-19 | 456710 | 346 |
+-----------+-----------+-----------+----------+
My aim is to return the total number of flights flying and not flying on 18-oct-19.
I'm doing it with dual but that doesn't seem to be the correct/best method.
Can anyone help me do it the correct way?
SELECT
(SELECT COUNT(FLIGHT_NO) NO_FLY FROM schd_flight WHERE fl_date = '18-OCT-19') AS FLY,
(SELECT COUNT(FLIGHT_NO) NO_FLY FROM schd_flight WHERE fl_date <> '18-OCT-19') AS NO_FLY
FROM dual;
My output -
+-----+--------+
| fly | no_fly |
+-----+--------+
| 3 | 12 |
+-----+--------+

Simply use sum with case statement
Select
sum(case when fl_date = '18-OCT-19' then 1 end) fly,
sum(case when fl_date <> '18-OCT-19' then 1 end) no_fly
From schd_flight;
Cheers!!

I think the second query is not necessary, no_fly = total - fly.
So I came up with my solution, may improve the query time :
SELECT sub.FLY as FLY, (SELECT count(*) from schd_flight) - sub.FLY as NO_FLY
FROM (
SELECT COUNT(CASE when fl_date = '18-OCT-19' then 1 end) AS FLY
from schd_flight
) sub;
Not tested yet though.

Related

Making rank() ignore null cases

I'm trying to figure out a way to make the rank() window function "skip" in its counting some rows with null values in a specific column. Pretty much, what I want is to count how many paid transactions there are, for each client, before each transaction/row.
I tried using case when inside the rank() and I got something similar to the results I expect, but still not quite what I need.
+-------------------------------------------------------+
| What I need |
+-------------+------+----------+-----------------------+
| CLIENT | CODE | PAYMENT | PAID_PURCHASES_SO_FAR |
| A | 341 | 17/09/21 | 0 |
| A | 342 | 18/09/21 | 1 |
| A | 343 | (null) | 2 |
| A | 344 | 18/09/21 | 2 |
| A | 345 | 19/09/21 | 3 |
| A | 346 | 19/09/21 | 4 |
| A | 347 | (null) | 5 |
| A | 348 | 24/09/21 | 5 |
| B | 855 | (null) | 0 |
| B | 856 | 20/09/21 | 0 |
| B | 857 | (null) | 1 |
+-------------+------+----------+-----------------------+
-+------------------------------------------------------+
| What I got |
-+------------+------+----------+-----------------------+
| CLIENT | CODE | PAYMENT | PAID_PURCHASES_SO_FAR |
| A | 341 | 17/09/21 | 0 |
| A | 342 | 18/09/22 | 1 |
| A | 343 | (null) | (null) |
| A | 344 | 18/09/22 | 2 |
| A | 345 | 19/09/22 | 3 |
| A | 346 | 19/09/21 | 4 |
| A | 347 | (null) | (null) |
| A | 348 | 24/09/21 | 5 |
| B | 855 | (null) | (null) |
| B | 856 | 20/09/21 | 0 |
| B | 857 | (null) | (null) |
-+------------+------+----------+-----------------------+
In a single image: comparison
And here my code:
SELECT
CLIENT
, CODE
, PAYMENT
, CASE WHEN PAYMENT IS NOT NULL THEN DENSE_RANK() OVER(PARTITION BY CLIENT, (CASE WHEN PAYMENT IS NOT NULL THEN 1 ELSE 0 END) ORDER BY CODE) - 1 END NUMBER_OF_PURCHASES_SO_FAR
FROM FOO.BAR
Note: The CODE column may be used as time reference. E.g. code = 750 came before code = 751, and so on.
Any help would be appreciated.
Thanks in advance.

You can use conditional aggregation combined with a window frame, as in:
select *, coalesce(sum(case when payment is null then 0 else 1 end)
over(partition by client order by code
rows between unbounded preceding and 1 preceding), 0)
as ppsf
from t
order by client, code
Result:
client code payment ppsf
------- ----- --------- ----
A 341 17/09/21 0
A 342 18/09/21 1
A 343 null 2
A 344 18/09/21 2
A 345 19/09/21 3
A 346 19/09/21 4
A 347 null 5
A 348 24/09/21 5
B 855 null 0
B 856 20/09/21 0
B 857 null 1
See running example at db<>fiddle.

This is it:
SELECT
"CLIENT"
, "CODE"
, "PAYMENT"
, rank() over(partition by "CLIENT"
order by COALESCE("PAYMENT",'01/01/70') )
FROM Table1
http://sqlfiddle.com/#!15/f941a/14

In database my query getting first row When i grouped

SELECT
EVENT_ID, COUNT(*), SEQUENCE_NBR
FROM
ALERTS
WHERE
ACKNOWLEDGED = 0
AND SRC_EXT = '7878'
GROUP BY
EVENT_ID
ORDER BY
COUNT(*) DESC;
Running this query to get id,count and sequence number. by group by event id and count i am getting sequence number first row .
+----------+--------------+
| EVENT_ID | SEQUENCE_NBR |
+----------+--------------+
| 150 | 9752 |
| 150 | 9764 |
| 150 | 9775 |
| 170 | 9755 |
| 170 | 9763 |
| 170 | 9774 |
| 217 | 9748 |
| 217 | 9759 |
| 217 | 9770 |
| 218 | 9751 |
| 218 | 9762 |
| 218 | 9773 |
| 273 | 9749 |
| 273 | 9760 |
| 273 | 9771 |
| 285 | 9750 |
| 285 | 9761 |
| 285 | 9772 |
+----------+--------------+
This is my data in db by using above query
+----------+----------+--------------+
| EVENT_ID | COUNT(*) | SEQUENCE_NBR |
+----------+----------+--------------+
| 150 | 3 | 9752 |
| 170 | 3 | 9755 |
| 217 | 3 | 9748 |
| 218 | 3 | 9751 |
| 273 | 3 | 9749 |
| 285 | 3 | 9750 |
+----------+----------+--------------+
i need data in same format with seuence number should
150 | 3 | 9775

Your query is malformed. You have SEQUENCE_NBR in the SELECT, but it is not in the GROUP BY. In most databases (including the more recent versions of MySQL), this generates an error. Happily that is so.
If you want the maximum SEQUENCE_NBR, then use the MAX() function:
SELECT EVENT_ID, COUNT(*), MAX(SEQUENCE_NBR) as SEQUENCE_NBR
FROM ALERTS
WHERE ACKNOWLEDGED = 0 AND
SRC_EXT = '7878'
GROUP BY EVENT_ID
ORDER BY COUNT(*) DESC;

you can try like below
SELECT
EVENT_ID, COUNT(*), SEQUENCE_NBR
FROM
ALERTS
WHERE
ACKNOWLEDGED = 0
AND SRC_EXT = '7878'
GROUP BY
EVENT_ID
ORDER BY
COUNT(*) DESC,SEQUENCE_NBR desc;

Selecting the first instance of a vendor, part combination

I am trying to create an indicator for if a particular transaction was the first time a part was purchased from a particular vendor.
I have a dataset that looks like this:
| transaction_id | vendor_id | part_id | trans_date |
|:--------------:|:---------:|:-------:|:-----------------:|
| 9Bx*2Pc' | a | 873 | 10/12/2018 |
| 1Po.4Ot, | a | 473 | 4/22/2016 |
| 9Sk"7Kv/ | b | 123 | 7/23/2016 |
| 2Lz&7Hu& | a | 873 | 12/20/2017 |
| 8Lz)5Is# | b | 743 | 10/22/2016 |
| 5Sc'6Jl/ | a | 113 | 10/6/2016 |
| 0Ra&8Hb& | a | 653 | 10/4/2017 |
| 4Wc-8Of* | c | 333 | 8/3/2017 |
| 8Vv+9Yo/ | c | 333 | 12/7/2016 |
| 6Qh!1Ha- | c | 333 | 3/28/2017 |
| 2Ol%4Rs# | c | 333 | 5/2/2017 |
| 1Gg#8Cm% | c | 333 | 11/15/2016 |
| 0Lw(6Pv/ | d | 873 | 8/13/2017 |
| 1Gy/7Zw, | a | 443 | 10/12/2018 |
| 2Gz,4Gp. | b | 103 | 1/5/2018 |
| 5Dj)6Wc+ | a | 893 | 12/17/2016 |
| 5Hl-8Ds! | a | 903 | 12/8/2017 |
| 8Ws$3Vy* | b | 873 | 1/13/2018 |
What I am looking to do is determine if the transaction_id was the first time (sorted by trans_date), that the part_id was purchased from a vendor_id. I would imagine the ideal output to look like this:
| transaction_id | vendor_id | part_id | trans_date | first_time |
|:--------------:|:---------:|:-------:|:-----------------:|:----------:|
| 9Bx*2Pc' | a | 873 | 10/12/2018 | N |
| 1Po.4Ot, | a | 473 | 4/22/2016 | Y |
| 9Sk"7Kv/ | b | 123 | 7/23/2016 | Y |
| 2Lz&7Hu& | a | 873 | 12/20/2017 | Y |
| 8Lz)5Is# | b | 743 | 10/22/2016 | Y |
| 5Sc'6Jl/ | a | 113 | 10/6/2016 | Y |
| 0Ra&8Hb& | a | 653 | 10/4/2017 | Y |
| 4Wc-8Of* | c | 333 | 8/3/2017 | N |
| 8Vv+9Yo/ | c | 333 | 12/7/2016 | N |
| 6Qh!1Ha- | c | 333 | 3/28/2017 | N |
| 2Ol%4Rs# | c | 333 | 5/2/2017 | N |
| 1Gg#8Cm% | c | 333 | 11/15/2016 | Y |
| 0Lw(6Pv/ | d | 873 | 8/13/2017 | Y |
| 1Gy/7Zw, | a | 443 | 10/12/2018 | Y |
| 2Gz,4Gp. | b | 103 | 1/5/2018 | Y |
| 5Dj)6Wc+ | a | 893 | 12/17/2016 | Y |
| 5Hl-8Ds! | a | 903 | 12/8/2017 | Y |
| 8Ws$3Vy* | b | 873 | 1/13/2018 | Y |
So far, I have tried (which was influenced by this post):
WITH
first_instance AS (
SELECT
tbl_trans.*,
ROW_NUMBER() OVER (PARTITION BY vendor_id||part_id ORDER BY trans_date) AS row_nums
FROM
tbl_trans
)
SELECT
x.*,
CASE WHEN y.row_nums = 1 THEN 'Y' ELSE 'N' END AS first_time_indicator
FROM
tbl_trans x
LEFT JOIN first_instance y
But I am met with:
ORA-00905: missing keyword
I have created a SQL FIDDLE with this data and the query thus far for testing. How can I determine the if a transaction was a first time purchase for a part/vendor combination?

Use window functions:
select t.*,
(case when row_number() over (partition by vendor_id, part_id order by trans_date) = 1
then 'Y' else 'N'
end) as first_time
from tbl_trans t;
You don't need a join.

Apart from row_number, there are multiple ways of achieving the desired result using analytical function as follows.
You can use first_value analytical function as follows:
Select t.*,
Case
when first_value(trans_date)
over (partition by vendor_id, part_id order by trans_date) = trans_date
then 'Y'
else 'N'
end as first_time
From your_table t;
The same way, you can also use min as follows:
Select t.*,
Case
when min(trans_date)
over (partition by vendor_id, part_id) = trans_date
then 'Y'
else 'N'
end as first_time
From your_table t;

SQL subcategory total is not properly placed at the end of every parent category

I am having trouble in SQl query,The query result should be like this
+------------+------------+-----+------+-------+--+--+--+
| District | Tehsil | yes | no | Total | | | |
+------------+------------+-----+------+-------+--+--+--+
| ABBOTTABAD | ABBOTTABAD | 377 | 5927 | 6304 | | | |
| ABBOTTABAD | HAVELIAN | 112 | 2276 | 2388 | | | |
| ABBOTTABAD | Overall | 489 | 8203 | 8692 | | | |
| CHARSADDA | CHARSADDA | 289 | 3762 | 4051 | | | |
| CHARSADDA | SHABQADAR | 121 | 1376 | 1497 | | | |
| CHARSADDA | TANGI | 94 | 1703 | 1797 | | | |
| CHARSADDA | Overall | 504 | 6841 | 7345 | | | |
+------------+------------+-----+------+-------+--+--+--+
The overall total should be should be shown at the end of every parent category but now it is showing like this
+------------+------------+-----+------+-------+--+--+--+
| District | Tehsil | yes | no | Total | | | |
+------------+------------+-----+------+-------+--+--+--+
| ABBOTTABAD | ABBOTTABAD | 377 | 5927 | 6304 | | | |
| ABBOTTABAD | HAVELIAN | 112 | 2276 | 2388 | | | |
| ABBOTTABAD | Overall | 489 | 8203 | 8692 | | | |
| CHARSADDA | CHARSADDA | 289 | 3762 | 4051 | | | |
| CHARSADDA | Overall | 504 | 6841 | 7345 | | | |
| CHARSADDA | SHABQADAR | 121 | 1376 | 1497 | | | |
| CHARSADDA | TANGI | 94 | 1703 | 1797 | | | |
+------------+------------+-----+------+-------+--+--+--+
My query is sorting second column with respect to first column although order by query is applied on my first column. This is my query
select District as 'District', tName as 'tehsil',[1] as 'yes',[0] as 'no',ISNULL([1]+[0], 0) as "Total" from
(
select d.Name as 'District',
case when grouping (t.Name)=1 then 'Overall' else t.Name end as tName,
BoundaryWallAvailable,
count(*) as total from School s
INNER JOIN SchoolIndicator i ON (i.refSchoolID=s.SchoolID)
INNER JOIN Tehsil t ON (t.TehsilID=s.refTehsilID)
INNER JOIN district d ON (d.DistrictID=t.refDistrictID)
group by
GROUPING sets((d.Name, BoundaryWallAvailable), (d.Name,t.Name, BoundaryWallAvailable))
) B
PIVOT
(
max(total) for BoundaryWallAvailable in ([1],[0])
) as Pvt
order by District
P.S: BoundaryWall is one column through pivoting i am breaking it into Yes and No Column

equating an entry to an aggregated version of itself

I am trying to find if an entry's value is the max of the grouped value. Its purpose is to sit in a larger if logic.
Which I'd expect would look something like this:
SELECT
t.id as t_id,
sum(if(t.value = max(t.value), 1, 0)) AS is_max_value
FROM dataset.table AS t
GROUP BY t_id
The response is:
Error: Expression 't.value' is not present in the GROUP BY list
How should my code look to do this?

You first need to compile in a subquery the max value, then join again the value to the table.
Using the public data set available here is an example:
SELECT
t.word,
t.word_count,
t.corpus_date
FROM
[publicdata:samples.shakespeare] t
JOIN (
SELECT
corpus_date,
MAX(word_count) word_count,
FROM
[publicdata:samples.shakespeare]
GROUP BY
1 ) d
ON
d.corpus_date=t.corpus_date
AND t.word_count=d.word_count
LIMIT
25
Results:
+-----+--------+--------------+---------------+---+
| Row | t_word | t_word_count | t_corpus_date | |
+-----+--------+--------------+---------------+---+
| 1 | the | 762 | 1597 | |
| 2 | the | 894 | 1598 | |
| 3 | the | 841 | 1590 | |
| 4 | the | 680 | 1606 | |
| 5 | the | 942 | 1607 | |
| 6 | the | 779 | 1609 | |
| 7 | the | 995 | 1600 | |
| 8 | the | 937 | 1599 | |
| 9 | the | 738 | 1612 | |
| 10 | the | 612 | 1595 | |
| 11 | the | 848 | 1592 | |
| 12 | the | 753 | 1594 | |
| 13 | the | 740 | 1596 | |
| 14 | I | 828 | 1603 | |
| 15 | the | 525 | 1608 | |
| 16 | the | 363 | 0 | |
| 17 | I | 629 | 1593 | |
| 18 | I | 447 | 1611 | |
| 19 | the | 715 | 1602 | |
| 20 | the | 717 | 1610 | |
+-----+--------+--------------+---------------+---+
You can see that retains the word that have the maximum word_count in the partition defined by corpus_date

Use window function to "spread" the max value over all relevant records.
this way you can avoid the Join.
SELECT
*
FROM (
SELECT
corpus,
corpus_date,
word,
word_count,
MAX(word_count) OVER (PARTITION BY corpus) AS Max_Word_Count
FROM
[publicdata:samples.shakespeare] )
WHERE
word_count=Max_Word_Count

select
id,
value,
integer(value = max_value) as is_max_value
from (
select id, value, max(value) over(partition by id) as max_value
from dataset.table
)
Explanation:
Inner select - for each row/record calculates max of value among all rows with the same id
Outer select - for each row/record compares row's value with max value for respective group and then converts true or false into respectively 1 or 0 (as per expectation in question)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Getting two columns one containing and one not containing a grouped value - sql

Simply use sum with case statement Select sum(case when fl_date = '18-OCT-19' then 1 end) fly, sum(case when fl_date <> '18-OCT-19' then 1 end) no_fly From schd_flight; Cheers!!

Related

Making rank() ignore null cases

In database my query getting first row When i grouped

Selecting the first instance of a vendor, part combination

SQL subcategory total is not properly placed at the end of every parent category

equating an entry to an aggregated version of itself

Categories

Resources