I need to display a '0' value in the Avg Delay table for stations that appear in the Schedule table but have no delays.
Here's the Schedule table:
+---------+---------+----------+
| Station | On Time | Schedule |
+---------+---------+----------+
| AMQ | 174 | 202 |
| AMS | 21 | 27 |
| BDJ | 182 | 210 |
| BDO | 56 | 62 |
| BEJ | 59 | 62 |
| BIK | 74 | 93 |
| BKK | 81 | 87 |
| BKS | 73 | 87 |
| BMU | 60 | 60 |
| BOM | 2 | 7 |
| BPN | 413 | 452 |
+---------+---------+----------+
Here's the Avg Delay table:
+---------+---------------------+
| Station | Avg Delay Departure |
+---------+---------------------+
| AMQ | 53.21 |
| AMS | 49.5 |
| BDJ | 60.78 |
| BDO | 67.66 |
| BEJ | 46.33 |
| BIK | 47.53 |
| BKK | 55.5 |
| BKS | 67.56 |
| BOM | 45.2 |
| BPN | 53.81 |
+---------+---------------------+
Pay attention to the BMU record in the Schedule table. It has 60 schedules and 60 on time, so there's no delay. I want to display the BMU record in the Avg Delay table with a value of '0' in the Avg Delay Departure column. My current query doesn't display it.
Here's the query for the Avg Delay table:
SELECT DEPAIRPORT AS STATION, to_number(to_char(trunc(sysdate) + avg(cast(ACTUAL_BLOCKOFF_LC as date) - cast(SCHEDULED_DEPDT_LC as date)), 'sssss'))/60 as DEPAVERAGE
FROM DBODSXML4OPS.XML4OPS
WHERE ACTUAL_BLOCKOFF_LC IS NOT NULL AND SERVICETYPE IN ('J','G') AND (ACTUAL_BLOCKOFF_LC - SCHEDULED_DEPDT_LC)*24*60 > '+000000015 00:00:00.000000000'
AND STATUS IN ('Scheduled') AND
TO_CHAR(SCHEDULED_DEPDT_LC, 'yyyy-mm-dd') BETWEEN '2018-04-14' AND '2018-05-14'
GROUP BY DEPAIRPORT
ORDER BY STATION ASC;
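One possible way to get a row for BMU is to start from the station list of the schedule data, LEFT JOIN the delay aggregate to it, and turn the missing average into 0 with COALESCE. The sketch below uses Python's sqlite3 module rather than Oracle, and the table names schedule and avg_delay are hypothetical stand-ins for the two result sets shown above; in Oracle the two queries would presumably be joined as subqueries in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical stand-ins for the two result sets in the question
    CREATE TABLE schedule (station TEXT, on_time INTEGER, scheduled INTEGER);
    CREATE TABLE avg_delay (station TEXT, depaverage REAL);
    INSERT INTO schedule VALUES ('BKS', 73, 87), ('BMU', 60, 60), ('BOM', 2, 7);
    INSERT INTO avg_delay VALUES ('BKS', 67.56), ('BOM', 45.2);
""")

# Every scheduled station is kept; stations without a delay row get 0.
rows = conn.execute("""
    SELECT s.station, COALESCE(d.depaverage, 0) AS depaverage
    FROM schedule s
    LEFT JOIN avg_delay d ON d.station = s.station
    ORDER BY s.station
""").fetchall()

for row in rows:
    print(row)
```

BMU has no row in avg_delay, so the LEFT JOIN yields NULL for it and COALESCE replaces that NULL with 0.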
My PostgreSQL database stores school vacations, public holidays and weekend dates so that parents can plan their vacation. School vacations are often adjacent to weekends or public holidays. I want to display the total number of non-school days for a school vacation, including any adjacent weekend or public holiday.
Example Data
locations
SELECT id, name, is_federal_state
FROM locations
WHERE is_federal_state = true;
| id | name | is_federal_state |
|----|-------------------|------------------|
| 2 | Baden-Württemberg | true |
| 3 | Bayern | true |
holiday_or_vacation_types
SELECT id, name FROM holiday_or_vacation_types;
| id | name |
|----|-----------------------|
| 1 | Herbst |
| 8 | Wochenende |
"Herbst" is German for "autumn" and "Wochenende" is German for "weekend".
periods
SELECT id, starts_on, ends_on, holiday_or_vacation_type_id
FROM periods
WHERE location_id = 2
ORDER BY starts_on;
| id | starts_on | ends_on | holiday_or_vacation_type_id |
|-----|--------------|--------------|-----------------------------|
| 670 | "2019-10-26" | "2019-10-27" | 8 |
| 532 | "2019-10-28" | "2019-10-30" | 1 |
| 533 | "2019-10-31" | "2019-10-31" | 1 |
| 671 | "2019-11-02" | "2019-11-03" | 8 |
| 672 | "2019-11-09" | "2019-11-10" | 8 |
| 673 | "2019-11-16" | "2019-11-17" | 8 |
Task
I want to select all periods where location_id equals 2. And I want to calculate the duration of each period in days. That can be done with this SQL query:
SELECT id, starts_on, ends_on,
(ends_on - starts_on + 1) AS duration,
holiday_or_vacation_type_id
FROM periods
WHERE location_id = 2
ORDER BY starts_on;
| id | starts_on | ends_on | duration | holiday_or_vacation_type_id |
|-----|--------------|--------------|----------|-----------------------------|
| 670 | "2019-10-26" | "2019-10-27" | 2 | 8 |
| 532 | "2019-10-28" | "2019-10-30" | 3 | 1 |
| 533 | "2019-10-31" | "2019-10-31" | 1 | 1 |
| 671 | "2019-11-02" | "2019-11-03" | 2 | 8 |
| 672 | "2019-11-09" | "2019-11-10" | 2 | 8 |
| 673 | "2019-11-16" | "2019-11-17" | 2 | 8 |
Anyone looking at the calendar would see that the ids 670 (weekend), 532 (fall vacation) and 533 (fall vacation) are adjacent, so they add up to a 6-day vacation period. So far I compute this in a program, but that takes quite a lot of resources (the actual table contains some 500,000 items).
Problem 1
Which SQL query would result in the following output (it adds a real_duration column)? Is that even possible with SQL?
| id | starts_on | ends_on | duration | real_duration | holiday_or_vacation_type_id |
|-----|--------------|--------------|----------|---------------|-----------------------------|
| 670 | "2019-10-26" | "2019-10-27" | 2 | 6 | 8 |
| 532 | "2019-10-28" | "2019-10-30" | 3 | 6 | 1 |
| 533 | "2019-10-31" | "2019-10-31" | 1 | 6 | 1 |
| 671 | "2019-11-02" | "2019-11-03" | 2 | 2 | 8 |
| 672 | "2019-11-09" | "2019-11-10" | 2 | 2 | 8 |
| 673 | "2019-11-16" | "2019-11-17" | 2 | 2 | 8 |
Problem 2
Is it possible to list the adjoining periods in a part_of_range field? This would be the result. Can that be done with SQL?
| id | starts_on | ends_on | duration | part_of_range | holiday_or_vacation_type_id |
|-----|--------------|--------------|----------|---------------|-----------------------------|
| 670 | "2019-10-26" | "2019-10-27" | 2 | 670,532,533 | 8 |
| 532 | "2019-10-28" | "2019-10-30" | 3 | 670,532,533 | 1 |
| 533 | "2019-10-31" | "2019-10-31" | 1 | 670,532,533 | 1 |
| 671 | "2019-11-02" | "2019-11-03" | 2 | | 8 |
| 672 | "2019-11-09" | "2019-11-10" | 2 | | 8 |
| 673 | "2019-11-16" | "2019-11-17" | 2 | | 8 |
This is a gaps-and-islands problem. In this case you can use lag() to see where an island starts, and then a cumulative sum of those starts to give each island a group number.
The final operation is some aggregation over each group (using window functions):
SELECT p.*,
       (max(ends_on) OVER (PARTITION BY location_id, grp)
        - min(starts_on) OVER (PARTITION BY location_id, grp)) + 1 AS real_duration,
       -- partition by grp as well, so only the periods of the same island are listed
       array_agg(p.id) OVER (PARTITION BY location_id, grp) AS part_of_range
FROM (SELECT p.*,
             -- running count of island starts = island number
             count(*) FILTER (WHERE prev_eo < starts_on - INTERVAL '1 day')
                 OVER (PARTITION BY location_id ORDER BY starts_on) AS grp
      FROM (SELECT id, starts_on, ends_on, location_id, holiday_or_vacation_type_id,
                   lag(ends_on) OVER (PARTITION BY location_id ORDER BY starts_on) AS prev_eo
            FROM periods
           ) p
     ) p;
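To check the lag-plus-cumulative-sum idea on the sample rows, here is a minimal sketch using Python's sqlite3 module instead of PostgreSQL (it needs an SQLite build with window functions, 3.25 or later). It is an adaptation, not the query above: dates are stored as text and compared via julianday(), a CASE expression replaces the FILTER clause, and GROUP_CONCAT stands in for array_agg. The column names real_duration and part_of_range are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE periods (id INTEGER, starts_on TEXT, ends_on TEXT, location_id INTEGER);
    INSERT INTO periods VALUES
        (670, '2019-10-26', '2019-10-27', 2),
        (532, '2019-10-28', '2019-10-30', 2),
        (533, '2019-10-31', '2019-10-31', 2),
        (671, '2019-11-02', '2019-11-03', 2),
        (672, '2019-11-09', '2019-11-10', 2);
""")

query = """
WITH flagged AS (
    -- previous period's end date, per location
    SELECT id, location_id,
           julianday(starts_on) AS s, julianday(ends_on) AS e,
           LAG(julianday(ends_on)) OVER (PARTITION BY location_id ORDER BY starts_on) AS prev_e
    FROM periods
), grouped AS (
    -- a gap of more than one day starts a new island; the running sum numbers the islands
    SELECT *,
           SUM(CASE WHEN prev_e < s - 1 THEN 1 ELSE 0 END)
               OVER (PARTITION BY location_id ORDER BY s) AS grp
    FROM flagged
)
SELECT id,
       CAST(MAX(e) OVER w - MIN(s) OVER w + 1 AS INTEGER) AS real_duration,
       GROUP_CONCAT(id) OVER w AS part_of_range
FROM grouped
WINDOW w AS (PARTITION BY location_id, grp ORDER BY s
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY s
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike the desired output in Problem 2, a period that stands alone lists its own id in part_of_range here; blanking those out would need an extra check on the island size.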
I am working with a very large data set and have become confused; it feels like I keep repeating the same thing.
I want to count the number of trips per user from two tables, trips and session.
psql=> SELECT * FROM trips limit 10;
trip_id | session_ids | daily_user_id | seconds_start | seconds_end
---------+-----------------+---------------+---------------+-------------
400543 | {172079} | 17118 | 1575550944 | 1575551181
400542 | {172078} | 17118 | 1575541533 | 1575542171
400540 | {172077} | 17118 | 1575539001 | 1575539340
400538 | {172076} | 17117 | 1575540499 | 1575541999
400534 | {172074,172075} | 17117 | 1575537161 | 1575539711
400530 | {172073} | 17116 | 1575447043 | 1575447682
400529 | {172071} | 17115 | 1575496394 | 1575497803
400527 | {172070} | 17113 | 1575495241 | 1575496034
400525 | {172068} | 17115 | 1575485658 | 1575489378
400524 | {172067} | 17113 | 1575488721 | 1575490491
(10 rows)
psql=> SELECT * FROM session limit 10;
session_id | user_id | key | start_time | daily_user_id
------------+---------+--------------------------+------------+---------------
172079 | 43 | hLB8S7aSfp4gAFp7TykwYQ==+| 1575550921 | 17118
| | | |
172078 | 43 | YATMrL/AQ7Nu5q2dQTMT1A==+| 1575541530 | 17118
| | | |
172077 | 43 | fOLX4tqvsyFOP3DCyBZf1A==+| 1575538997 | 17118
| | | |
172076 | 7 | 88hwGj4Mqa58juy0PG/R4A==+| 1575540515 | 17117
| | | |
172075 | 7 | 1O+8X49+YbtmoEa9BlY5OQ==+| 1575538384 | 17117
| | | |
172074 | 7 | XOR7hsFCNk+soM75ZhDJyA==+| 1575537405 | 17117
| | | |
172073 | 42 | rAQWwYgqg3UMTpsBYSpIpA==+| 1575447109 | 17116
| | | |
172072 | 276 | 0xOsxRRN3Sq20VsXWjlrzQ==+| 1575511120 | 17114
| | | |
172071 | 7 | P4beN3W/ZrD+TCpZGYh23g==+| 1575496642 | 17115
| | | |
172070 | 43 | OFi30Zv9e5gmLZS5Vb+I7Q==+| 1575495238 | 17113
| | | |
(10 rows)
Goal: get the distribution of trips per user
Attempt:
psql=> SELECT COUNT(distinct trip_id) as trips
, count(distinct user_id) as users
, extract(year from to_timestamp(seconds_start)) as year_date
, extract(month from to_timestamp(seconds_start)) as month_date
FROM trips
INNER JOIN session
ON session_id = ANY(session_ids)
GROUP BY year_date, month_date
ORDER BY year_date, month_date;
+-------+-------+-----------+------------+
| trips | users | year_date | month_date |
+-------+-------+-----------+------------+
| 371 | 44 | 2016 | 3 |
| 12207 | 185 | 2016 | 4 |
| 3859 | 88 | 2016 | 5 |
| 1547 | 28 | 2016 | 6 |
| 831 | 17 | 2016 | 7 |
| 427 | 4 | 2016 | 8 |
| 512 | 13 | 2016 | 9 |
| 431 | 11 | 2016 | 10 |
| 1011 | 26 | 2016 | 11 |
| 791 | 15 | 2016 | 12 |
| 217 | 8 | 2017 | 1 |
| 490 | 17 | 2017 | 2 |
| 851 | 18 | 2017 | 3 |
| 1890 | 66 | 2017 | 4 |
| 2143 | 43 | 2017 | 5 |
| . | | | |
| . | | | |
| . | | | |
+-------+-------+-----------+------------+
This result set counts the number of users and trips per month. What I actually want is the number of trips per user, like so:
+------+-------------+
| user | no_of_trips |
+------+-------------+
| 1 | 489 |
| 2 | 400 |
| 3 | 12 |
| 4 | 102 |
| . | |
| . | |
| . | |
+------+-------------+
How do I do this, please?
You seem to just want aggregation by user_id:
SELECT s.user_id, COUNT(distinct t.trip_id) as trips
FROM trips t INNER JOIN
session s
ON s.session_id = ANY(t.session_ids)
GROUP BY s.user_id ;
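Keep the COUNT(DISTINCT) if a trip can have more than one session, as trip 400534 does in your sample (sessions 172074 and 172075, both for user 7): the join then returns one row per session, and COUNT(*) would count that trip twice. Only if every trip has exactly one session does the simpler form give the same result: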
SELECT s.user_id, COUNT(*) as trips
FROM trips t INNER JOIN
session s
ON s.session_id = ANY(t.session_ids)
GROUP BY s.user_id ;
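Whether COUNT(DISTINCT trip_id) and COUNT(*) agree depends on whether a trip can join to more than one session. A small sketch with Python's sqlite3 module shows the difference on a few of the sample rows. SQLite has no array type, so the session_ids array is unnested into a hypothetical trip_sessions table (one row per trip and session); that table is an assumption made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical unnested form of trips.session_ids
    CREATE TABLE trip_sessions (trip_id INTEGER, session_id INTEGER);
    CREATE TABLE session (session_id INTEGER, user_id INTEGER);
    INSERT INTO trip_sessions VALUES
        (400543, 172079), (400538, 172076), (400534, 172074), (400534, 172075);
    INSERT INTO session VALUES
        (172079, 43), (172076, 7), (172075, 7), (172074, 7);
""")

# trips counts each trip once; joined_rows counts one row per (trip, session) pair
rows = conn.execute("""
    SELECT s.user_id,
           COUNT(DISTINCT ts.trip_id) AS trips,
           COUNT(*) AS joined_rows
    FROM trip_sessions ts
    JOIN session s ON s.session_id = ts.session_id
    GROUP BY s.user_id
    ORDER BY s.user_id
""").fetchall()

for row in rows:
    print(row)
```

User 7 has two trips but three joined rows, because trip 400534 spans two sessions; for user 43 both counts are 1.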
I am having trouble with a SQL query. The query result should look like this:
+------------+------------+-----+------+-------+
| District   | Tehsil     | yes | no   | Total |
+------------+------------+-----+------+-------+
| ABBOTTABAD | ABBOTTABAD | 377 | 5927 | 6304  |
| ABBOTTABAD | HAVELIAN   | 112 | 2276 | 2388  |
| ABBOTTABAD | Overall    | 489 | 8203 | 8692  |
| CHARSADDA  | CHARSADDA  | 289 | 3762 | 4051  |
| CHARSADDA  | SHABQADAR  | 121 | 1376 | 1497  |
| CHARSADDA  | TANGI      | 94  | 1703 | 1797  |
| CHARSADDA  | Overall    | 504 | 6841 | 7345  |
+------------+------------+-----+------+-------+
The overall total should be shown at the end of every parent category, but at the moment it is displayed like this:
+------------+------------+-----+------+-------+
| District   | Tehsil     | yes | no   | Total |
+------------+------------+-----+------+-------+
| ABBOTTABAD | ABBOTTABAD | 377 | 5927 | 6304  |
| ABBOTTABAD | HAVELIAN   | 112 | 2276 | 2388  |
| ABBOTTABAD | Overall    | 489 | 8203 | 8692  |
| CHARSADDA  | CHARSADDA  | 289 | 3762 | 4051  |
| CHARSADDA  | Overall    | 504 | 6841 | 7345  |
| CHARSADDA  | SHABQADAR  | 121 | 1376 | 1497  |
| CHARSADDA  | TANGI      | 94  | 1703 | 1797  |
+------------+------------+-----+------+-------+
The Tehsil column comes out sorted alphabetically within each district, even though the ORDER BY is applied only to the first column. This is my query:
select District as 'District', tName as 'tehsil',[1] as 'yes',[0] as 'no',ISNULL([1]+[0], 0) as "Total" from
(
select d.Name as 'District',
case when grouping (t.Name)=1 then 'Overall' else t.Name end as tName,
BoundaryWallAvailable,
count(*) as total from School s
INNER JOIN SchoolIndicator i ON (i.refSchoolID=s.SchoolID)
INNER JOIN Tehsil t ON (t.TehsilID=s.refTehsilID)
INNER JOIN district d ON (d.DistrictID=t.refDistrictID)
group by
GROUPING sets((d.Name, BoundaryWallAvailable), (d.Name,t.Name, BoundaryWallAvailable))
) B
PIVOT
(
max(total) for BoundaryWallAvailable in ([1],[0])
) as Pvt
order by District
P.S.: BoundaryWallAvailable is a single column; through pivoting I am splitting it into the Yes and No columns.
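One way the 'Overall' row might be pushed to the end of each district is an extra sort key that is 1 for the total row and 0 otherwise, placed between the district and the tehsil name. The sketch below only demonstrates that ORDER BY pattern, using Python's sqlite3 module on a hypothetical table named pivoted that holds a few columns of the pivoted result; it is untested against SQL Server, where the same CASE expression would presumably go into the outer ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table standing in for the pivoted result
    CREATE TABLE pivoted (District TEXT, tName TEXT, Total INTEGER);
    INSERT INTO pivoted VALUES
        ('CHARSADDA',  'Overall',    7345),
        ('CHARSADDA',  'TANGI',      1797),
        ('CHARSADDA',  'CHARSADDA',  4051),
        ('CHARSADDA',  'SHABQADAR',  1497),
        ('ABBOTTABAD', 'Overall',    8692),
        ('ABBOTTABAD', 'HAVELIAN',   2388),
        ('ABBOTTABAD', 'ABBOTTABAD', 6304);
""")

# Sort by district, then detail rows before the total row, then by tehsil name.
rows = conn.execute("""
    SELECT District, tName, Total
    FROM pivoted
    ORDER BY District,
             CASE WHEN tName = 'Overall' THEN 1 ELSE 0 END,
             tName
""").fetchall()

for row in rows:
    print(row)
```

Matching on the literal 'Overall' would misplace a tehsil that really has that name; carrying a flag derived from grouping(t.Name) out of the inner query would avoid that.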
I have an MS Access view generating this result:
+-------+------------+-------+---------+--------+-------+
| Id | Date | Kind | Initial | Final | Total |
+-------+------------+-------+---------+--------+-------+
| 334AB | 01/04/2017 | Red | 199725 | 199789 | 64 |
| 334AB | 01/04/2017 | Green | 199789 | 199799 | 10 |
| 107AE | 01/04/2017 | Red | 73978 | 74074 | 96 |
| 107AE | 02/04/2017 | Green | 74074 | 74248 | 174 |
+-------+------------+-------+---------+--------+-------+
Generated with:
Group by ID, Date and Kind
Initial: Min(startKm)
Final: Max(endKm)
Total: Sum(Distance)
This is the query:
SELECT street.Id, street.Date, IIf(IsNull([agev]), Kind, Min(street.Initial) AS Iniziali, Max(street.Final) AS Finali, Sum(street.Distance) AS Total
FROM street
GROUP BY street.Id, street.Date, Kind
ORDER BY street.Date;
What I need is this result:
+-------+------------+---------+--------+----------+------------+-------+
| Id | Date | Initial | Final | TotalRed | TotalGreen | Total |
+-------+------------+---------+--------+----------+------------+-------+
| 334AB | 01/04/2017 | 199725 | 199799 | 64 | 10 | 74 |
| 107AE | 01/04/2017 | 73978 | 74074 | 96 | 0 | 96 |
| 107AE | 02/04/2017 | 74074 | 74248 | 0 | 174 | 174 |
+-------+------------+---------+--------+----------+------------+-------+
where Initial is the lowest "Initial" km for that Id on that day,
and Final is the highest "Final" km for that Id on that day.
What do you suggest?
thanks
Conditional aggregation should work here, like this:
SELECT street.Id
,street.Date
,Min(street.Initial) AS Iniziali
,Max(street.Final) AS Finali
,SUM(IIF(street.Kind = 'Red',street.Distance,0)) AS TotalRed
,SUM(IIF(street.Kind = 'Green',street.Distance,0)) AS TotalGreen
,Sum(street.Distance) AS Total
FROM street
GROUP BY street.Id
,street.Date
ORDER BY street.Date;
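The same conditional-aggregation pattern can be tried outside Access. The sketch below uses Python's sqlite3 module, with CASE WHEN in place of IIF, and loads the four rows from the question as if each were a single row of the street table (an assumption; the real table presumably has more detailed rows). The dates are kept as the text shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE street (Id TEXT, Date TEXT, Kind TEXT,
                         Initial INTEGER, Final INTEGER, Distance INTEGER);
    INSERT INTO street VALUES
        ('334AB', '01/04/2017', 'Red',   199725, 199789, 64),
        ('334AB', '01/04/2017', 'Green', 199789, 199799, 10),
        ('107AE', '01/04/2017', 'Red',   73978,  74074,  96),
        ('107AE', '02/04/2017', 'Green', 74074,  74248,  174);
""")

# One row per Id and Date; each Kind gets its own conditional sum.
rows = conn.execute("""
    SELECT Id, Date,
           MIN(Initial) AS Initial,
           MAX(Final)   AS Final,
           SUM(CASE WHEN Kind = 'Red'   THEN Distance ELSE 0 END) AS TotalRed,
           SUM(CASE WHEN Kind = 'Green' THEN Distance ELSE 0 END) AS TotalGreen,
           SUM(Distance) AS Total
    FROM street
    GROUP BY Id, Date
    ORDER BY Date, Id
""").fetchall()

for row in rows:
    print(row)
```

Dropping Kind from the GROUP BY is what collapses the Red and Green rows into one; the ELSE 0 branch is what produces the zeros for a missing kind.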
I was asked to create a report (using Teradata SQL OLAP functions) as below
EMPL_ID | perd_end_d | pdct_I | Year to date sal Amnt | Diff in sale amnt from Prev month
-------------------------------------------------------------------------------------------
I was given the following "sales" dataset and I have to calculate the "Year to date sale amount" and the "difference between the current and previous month's sale amount":
empl_id| perd_end_d | pdct_I|sale_amnt|
----------------------------------------
E1001 | 31-01-2010 | P2003 | 2,03 |
E1003 | 31-01-2010 | P2015 | 44 |
E1003 | 31-01-2010 | P2004 | 67,6 |
E1001 | 31-01-2010 | P2002 | 135 |
E1003 | 31-01-2010 | P2003 | 545 |
E1001 | 31-01-2010 | P2001 | 1,00 |
E1002 | 31-01-2010 | P2005 | 23 |
E1002 | 31-01-2010 | P2007 | 343 |
E1006 | 28-02-2010 | P2005 | 34 |
E1006 | 28-02-2010 | P2004 | 43 |
E1001 | 28-02-2010 | P2003 | 54 |
E1001 | 28-02-2010 | P2002 | 878 |
E1003 | 28-02-2010 | P2008 | 434 |
E1001 | 28-02-2010 | P2001 | 66 |
E1007 | 28-02-2010 | P2009 | 455 |
E1007 | 28-02-2010 | P2009 | 4,54 |
E1003 | 28-02-2010 | P2007 | 56 |
E1008 | 28-02-2010 | P2009 | 786 |
E1010 | 31-01-2011 | P2001 | 300 |
E1001 | 31-01-2011 | P2002 | 200 |
E1009 | 31-01-2011 | P2003 | 100 |
E1011 | 31-01-2012 | P2004 | 700 |
E1002 | 31-01-2012 | P2005 | 400 |
E1011 | 31-01-2012 | P2003 | 600 |
E1002 | 31-01-2012 | P2007 | 500 |
---------------------------------------
I want something like the following:
empl_id| perd_end_d | pdct_I|sale_amnt| diff(cur_mnt_sal - prev_mnt_sal)
-------------------------------------------------------------------------
E1001 | 31-01-2010 | P2003 | 2,03 | 203 -- or may be null
E1003 | 31-01-2010 | P2015 | 44 | 159
E1003 | 31-01-2010 | P2004 | 67,6 | 632
E1001 | 31-01-2010 | P2002 | 135 | 541
E1003 | 31-01-2010 | P2003 | 545 | 410
...
So far I have managed to get the required result, but it looks ugly. How can I improve the following solution?
SELECT perd_end_d
, pdct_I
, sale_amnt
, ABS( SUM(sale_amnt) over (partition by perd_end_d
order by perd_end_d
rows between 1 preceding and 1 preceding )
- SUM(sale_amnt) over (partition by perd_end_d
order by perd_end_d
rows current row ) )"prev_mnt_sal - cur_mnt_sal"
from sandbox.sales;
and an alternative version is as follows:
SELECT perd_end_d
, pdct_I
, sale_amnt
, ABS( min(sale_amnt) over (partition by perd_end_d
order by perd_end_d
rows between 1 preceding and 1 preceding )
- sale_amnt) as "prev_mnt_sal - cur_mnt_sal"
from sandbox.sales;
You probably want something like this:
SELECT empl_id
, perd_end_d
, sum(sale_amnt) as sumsale
-- cumulative sum of sales per employee
, SUM(sumsale)
over (partition by empl_id
order by perd_end_d
rows unbounded preceding)
-- difference between current and previous month per employee
, sumsale -
SUM(sumsale)
over (partition by empl_id
order by perd_end_d
rows between 1 preceding and 1 preceding )
from sandbox.sales
group by 1,2;
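To see what the two window expressions return, here is a minimal sketch using Python's sqlite3 module instead of Teradata (SQLite 3.25 or later). Several things are assumptions made for the illustration: only rows with plain integer amounts are loaded, dates are rewritten in ISO format so they sort correctly as text, the monthly sum is computed in a CTE because SQLite cannot reuse the sumsale alias in the same SELECT, and the running total is additionally partitioned by year so that it restarts each January, which the query above does not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (empl_id TEXT, perd_end_d TEXT, pdct_I TEXT, sale_amnt INTEGER);
    INSERT INTO sales VALUES
        ('E1001', '2010-01-31', 'P2002', 135),
        ('E1001', '2010-02-28', 'P2003', 54),
        ('E1001', '2010-02-28', 'P2002', 878),
        ('E1001', '2010-02-28', 'P2001', 66),
        ('E1001', '2011-01-31', 'P2002', 200),
        ('E1003', '2010-01-31', 'P2015', 44),
        ('E1003', '2010-01-31', 'P2003', 545),
        ('E1003', '2010-02-28', 'P2008', 434),
        ('E1003', '2010-02-28', 'P2007', 56);
""")

query = """
WITH monthly AS (
    SELECT empl_id, perd_end_d, SUM(sale_amnt) AS sumsale
    FROM sales
    GROUP BY empl_id, perd_end_d
)
SELECT empl_id, perd_end_d, sumsale,
       -- running total per employee, restarted each calendar year
       SUM(sumsale) OVER (PARTITION BY empl_id, strftime('%Y', perd_end_d)
                          ORDER BY perd_end_d
                          ROWS UNBOUNDED PRECEDING) AS ytd_sale,
       -- current month minus the previous month's total (NULL for the first month)
       sumsale - SUM(sumsale) OVER (PARTITION BY empl_id
                                    ORDER BY perd_end_d
                                    ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS diff_prev
FROM monthly
ORDER BY empl_id, perd_end_d
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The frame ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING contains only the previous row, so the SUM over it simply returns the previous month's total, and NULL when there is no previous row.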