SQLite3 table subtraction - sql

I have two tables: a reference table, all_grid, and a table of customer details, t_customer.
I need to present the rows that are in the reference table but not in the customer table (i.e. show the rows where the customer_x and customer_y values are present in all_grid but not in t_customer). Columns are named the same in both tables, but t_customer has an id column too.
So far I've tried:
SELECT customer_x, customer_y FROM all_grid
EXCEPT
SELECT customer_x, customer_y FROM t_customer;
but this seems to just show all rows in all_grid and I'm not sure which terminology to use for SQLite.
t_customer table is as follows:
1|35|24
2|-20|30
3|-10|-20
4|35|-46
5|4|-19
6|30|36
7|-12|-24
8|-12|-16
9|-17|-10
10|99|99
11|-4|-29
12|35|24
13|13|28
14|99|99
15|-24|-3
16|-49|-39
17|99|99
18|-48|-44
19|-46|35
20|-28|-47
21|99|99
22|99|99
23|31|22
24|4|14
25|5|6
26|32|24
27|-34|-4
28|29|25
29|-12|-31
30|99|99
31|-17|41
32|-20|-42
33|99|99
34|-4|40
and all_grid holds all 100 possible combinations of customer_x and customer_y rounded down to the nearest 10, with (90, 90) included.

Apologies, I've realised that I did something stupid: I didn't take into account that the t_customer data had been rounded up or down to the nearest 10 in prior calculations.
My final query was:
SELECT (customer_x * 10), (customer_y * 10) FROM all_grid
EXCEPT
SELECT 10 * (t.customer_x / 10), 10 * (t.customer_y / 10) FROM
(SELECT CASE WHEN customer_x < 0 THEN customer_x - 10 ELSE customer_x END AS customer_x,
CASE WHEN customer_y < 0 THEN customer_y - 10 ELSE customer_y END AS customer_y
FROM t_customer) t
and seems to do the trick.
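For anyone who wants to try this on a small scale, here is a minimal sketch using Python's built-in sqlite3 module. The four grid cells and three customers are made-up sample data, and the grid is assumed to be stored in units of ten, as the final query implies. It shows why the first attempt returned every grid row (EXCEPT only removes rows whose values match exactly) and that the bucketed version leaves only the unoccupied cell.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_grid (customer_x INTEGER, customer_y INTEGER);
    CREATE TABLE t_customer (id INTEGER PRIMARY KEY,
                             customer_x INTEGER, customer_y INTEGER);
    -- Made-up sample data: grid cells in units of ten, raw customer coordinates.
    INSERT INTO all_grid VALUES (3, 2), (-2, -3), (0, -2), (1, 1);
    INSERT INTO t_customer (customer_x, customer_y)
        VALUES (35, 24), (-12, -24), (4, -19);
""")

# First attempt: raw coordinates never equal a grid cell, so nothing is removed.
naive = sorted(conn.execute("""
    SELECT customer_x * 10, customer_y * 10 FROM all_grid
    EXCEPT
    SELECT customer_x, customer_y FROM t_customer
""").fetchall())

# Final query from the post: shift negatives by 10 before the truncating
# integer division so they land in the lower cell.
bucketed = sorted(conn.execute("""
    SELECT customer_x * 10, customer_y * 10 FROM all_grid
    EXCEPT
    SELECT 10 * (t.customer_x / 10), 10 * (t.customer_y / 10) FROM
      (SELECT CASE WHEN customer_x < 0 THEN customer_x - 10 ELSE customer_x END AS customer_x,
              CASE WHEN customer_y < 0 THEN customer_y - 10 ELSE customer_y END AS customer_y
       FROM t_customer) t
""").fetchall())

print(len(naive))  # all four grid cells survive the naive EXCEPT
print(bucketed)    # only the cell with no customer in it
```

One thing to watch: SQLite integer division truncates towards zero, so with the `- 10` shift a coordinate that is already an exact negative multiple of 10 (say -20) ends up in the -30 cell. Whether that is wanted depends on how the grid itself was rounded.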

Related

How to create new column based on arithmetic operation after doing GROUP BY

I have a syntax question. I was trying an exercise out of the book "Practical SQL" by Anthony DeBarros. I did the exercise, which involved creating two new columns that created percent changes based on four columns from two different data sets. The query below runs great:
SELECT pls14.stabr AS us_state,
sum(pls14.gpterms) AS terminals_2014,
sum(pls09.gpterms) AS terminals_2009,
round( (CAST(sum(pls14.gpterms) AS decimal(10,1)) - sum(pls09.gpterms)) /
sum(pls09.gpterms) * 100, 2 ) AS terminal_pct_change,
sum(pls14.pitusr) AS term_uses_2014,
sum(pls09.pitusr) AS term_uses_2009,
round( (CAST(sum(pls14.pitusr) AS decimal(10,1)) - sum(pls09.pitusr)) /
sum(pls09.pitusr) * 100, 2 ) AS term_uses_pct_change
FROM pls_fy2014_pupld14a pls14 JOIN pls_fy2009_pupld09a pls09
ON pls14.fscskey = pls09.fscskey
WHERE pls14.gpterms >= 0 AND pls09.gpterms >= 0
GROUP BY pls14.stabr
ORDER BY terminal_pct_change DESC;
It results in the following table:
us_state terminals_2014 terminals_2009 terminal_pct_change term_uses_2014 term_uses_2009 term_uses_pct_change
"GU" 547 59 827.12 39842 19564 103.65
"DC" 1000 594 68.35 1050623 140251 649.10
"AK" 994 618 60.84 771075 1061498 -27.36
"DE" 772 487 58.52 622515 451689 37.82
"ID" 1792 1151 55.69 1878131 1986141 -5.44
"CO" 6407 4172 53.57 7395748 7672580 -3.61
I wanted to add a column that would show me the simple difference between the terminal_pct_change and term_uses_pct_change columns. I tried a lot of things and nothing has worked. I'm not sure where to insert the following line.
terminal_pct_change - term_uses_pct_change AS kept_up
I would love ideas to a) get this working, but more importantly b) to understand how the solution works.
Thanks!
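Try this. A column alias defined in a SELECT list cannot be referenced elsewhere in that same list, so wrap the original query in a derived table and compute the difference in the outer query, where the aliases are ordinary columns:
SELECT x.us_state,
x.terminals_2014,
x.terminals_2009,
(x.terminal_pct_change - x.term_uses_pct_change) AS kept_up,
x.term_uses_2014,
x.term_uses_2009
FROM (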
SELECT pls14.stabr AS us_state,
sum(pls14.gpterms) AS terminals_2014,
sum(pls09.gpterms) AS terminals_2009,
round( (CAST(sum(pls14.gpterms) AS decimal(10,1)) - sum(pls09.gpterms)) /
sum(pls09.gpterms) * 100, 2 ) AS terminal_pct_change,
sum(pls14.pitusr) AS term_uses_2014,
sum(pls09.pitusr) AS term_uses_2009,
round( (CAST(sum(pls14.pitusr) AS decimal(10,1)) - sum(pls09.pitusr)) /
sum(pls09.pitusr) * 100, 2 ) AS term_uses_pct_change
FROM pls_fy2014_pupld14a pls14 JOIN pls_fy2009_pupld09a pls09
ON pls14.fscskey = pls09.fscskey
WHERE pls14.gpterms >= 0 AND pls09.gpterms >= 0
GROUP BY pls14.stabr
) x
ORDER BY x.terminal_pct_change DESC;
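To see how the derived-table approach works in isolation, here is a small sketch run through Python's sqlite3 module. The table library_stats, its columns, and the three rows are hypothetical stand-ins for the joined 2014/2009 data, not the book's tables. The inner query does the grouping and names the two percentages; the outer query can then treat those names as plain columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-in for the joined library data.
    CREATE TABLE library_stats (state TEXT,
                                terms_2014 INTEGER, terms_2009 INTEGER,
                                uses_2014 INTEGER, uses_2009 INTEGER);
    INSERT INTO library_stats VALUES
        ('AA', 100, 60, 200, 100),
        ('AA',  50, 40, 100, 100),
        ('BB', 120, 100, 90, 100);
""")

rows = conn.execute("""
    SELECT x.us_state,
           x.terminal_pct_change,
           x.term_uses_pct_change,
           x.terminal_pct_change - x.term_uses_pct_change AS kept_up
    FROM (
        -- Inner query: aggregate per state and give the percentages names.
        SELECT state AS us_state,
               round((sum(terms_2014) - sum(terms_2009)) * 100.0
                     / sum(terms_2009), 2) AS terminal_pct_change,
               round((sum(uses_2014) - sum(uses_2009)) * 100.0
                     / sum(uses_2009), 2) AS term_uses_pct_change
        FROM library_stats
        GROUP BY state
    ) AS x
    ORDER BY kept_up DESC
""").fetchall()

for row in rows:
    print(row)
```

The ORDER BY sits on the outer query because the ordering of a derived table is not guaranteed to carry through to the final result.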

Is there a way I can Query Missing numbers in a table?

I work for a logistics company, and every piece of freight has to carry a 7-digit Pro Number that follows a pre-determined order. We know there are gaps in the numbers, but is there any way I can query the system to find out which ones are missing?
So: show me all the numbers from 1000000 to 2000000 that do not exist in the column trace_number.
As you can see below, the sequence goes 1024397, 1024398, then 1051152, so I know there is a substantial gap of about 26k pro numbers. Is there any way to query just the gaps?
Select t.trace_number,
integer(trace_number) as number,
ISNUMERIC(trace_number) as check
from trace as t
left join tlorder as tl on t.detail_number = tl.detail_line_id
where left(t.trace_number,1) in ('0','1','2','3','4','5','6','7','8','9')
and date(pick_up_by) >= current_date - 1 years
and length(t.trace_number) = 7
and t.trace_type = '2'
and site_id in ('SITE5','SITE9','SITE10')
and ISNUMERIC(trace_number) = 'True'
order by 2
fetch first 10000 rows only
I'm not sure what your query has to do with the question, but you can identify gaps using lag()/lead(). The idea is:
select (trace_number + 1) as start_gap,
(next_tn - 1) as end_gap
from (select t.*,
lead(trace_number) over (order by trace_number) as next_tn
from t
) t
where next_tn <> trace_number + 1;
This does not find them within a range. It just finds all gaps.
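As a rough illustration of the lead() idea, the sketch below runs it against an in-memory SQLite table from Python. It assumes SQLite 3.25 or newer (needed for window functions) and an integer trace_number column; in the original system the column looks like text, so a cast would be needed there. The six numbers are sample data loosely based on the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trace (trace_number INTEGER)")
conn.executemany(
    "INSERT INTO trace VALUES (?)",
    [(n,) for n in (1024396, 1024397, 1024398, 1051152, 1051153, 1051155)],
)

# Pair each number with the next one; wherever the next number is not
# "current + 1", the values in between are missing.
gaps = conn.execute("""
    SELECT trace_number + 1 AS start_gap,
           next_tn - 1      AS end_gap
    FROM (SELECT trace_number,
                 lead(trace_number) OVER (ORDER BY trace_number) AS next_tn
          FROM trace) AS t
    WHERE next_tn <> trace_number + 1
    ORDER BY start_gap
""").fetchall()

print(gaps)
```

The last row has no successor, so its next_tn is NULL and the comparison filters it out without any special handling.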
Try something like this (adapt the WHERE condition and put it into the ON clause):
with Range (nb) as (
values 1000000
union all
select nb+1 from Range
where nb < 2000000
)
select *
from range f1 left outer join trace f2
on f2.trace_number=f1.nb
and f2.trace_number between 1000000 and 2000000
where f2.trace_number is null
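The same number-range approach can be tried out with Python's sqlite3 module, as sketched below. The helper name missing_numbers and the sample rows are made up for illustration, and SQLite spells the recursive CTE slightly differently from the dialect above (WITH RECURSIVE, and SELECT for the seed value). A deliberately small range keeps the output readable.

```python
import sqlite3


def missing_numbers(conn, low, high):
    """Return every integer in [low, high] that is absent from trace.trace_number."""
    sql = """
        WITH RECURSIVE num_range(nb) AS (
            SELECT :low                       -- seed row
            UNION ALL
            SELECT nb + 1 FROM num_range      -- keep counting up ...
            WHERE nb < :high                  -- ... and stop exactly at :high
        )
        SELECT r.nb
        FROM num_range AS r
        LEFT JOIN trace AS t ON t.trace_number = r.nb
        WHERE t.trace_number IS NULL          -- no match means the number is missing
        ORDER BY r.nb
    """
    return [row[0] for row in conn.execute(sql, {"low": low, "high": high})]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trace (trace_number INTEGER)")
conn.executemany(
    "INSERT INTO trace VALUES (?)",
    [(n,) for n in (1000000, 1000001, 1000004, 1000010)],
)

missing = missing_numbers(conn, 1000000, 1000010)
print(missing)
```

This lists every missing number individually, so for a million-wide range with large holes the gap-pair output of the lead() approach is usually easier to read.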

SQL Server connections

Query 1 results in 87 rows:
SELECT
MIN(LEFT(Date_Last_Updated, 4)) AS Year_of_Transfer,
MIN(RIGHT(LEFT(Date_Last_Updated, 6), 2)) AS Month_of_Transfer,
MIN(RIGHT(LEFT(Date_Last_Updated, 8), 2)) AS Day_of_Transfer,
CRS, Loan_Code,
MIN(Date_Last_Updated) AS Min_Date_Last_Updated
FROM
dbo.Transfer_Final_Accounts_CO_SH
WHERE
(LEFT(Date_Last_Updated, 4) >= '2016')
AND (Subcategory = 'Transfer to Workout')
GROUP BY
CRS, Loan_Code
ORDER BY
Loan_Code
Query 2 results in 3400000 rows:
SELECT
Date_Last_Updated,
SUM(NPL_Amount_Last_Quarter) AS NPL_Amount_Last_Quarter,
Loan_Code,
SUM([Total_Balance_€]) AS Total_Balance,
SUM(On_Balance_Amount_Last_Quarter) AS On_Balance_Last_Q,
SUM([Off_Balance_Amount_€]) AS Off_Balance_Last_Q, CRS,
[On_Balance_Amount_€], [Off_Balance_Amount_€], [Total_Balance_€],
NPL_Amount_Last_Quarter AS NPL_Amount_Last_Q,
[NPL_Amount_€], On_Balance_Amount_Last_Quarter,
Material_Bucket, Material_Bucket_Last_Quarter
FROM
dbo.Transfer_Final_Accounts_COM_WORK
GROUP BY
CRS, Date_Last_Updated, Loan_Code, [On_Balance_Amount_€],
[Off_Balance_Amount_€], [Total_Balance_€], NPL_Amount_Last_Quarter,
[NPL_Amount_€], On_Balance_Amount_Last_Quarter, Material_Bucket,
Material_Bucket_Last_Quarter
Query 3 connects the 2 above and results in 0 rows:
SELECT
dbo.Transfer_to_Workout_Total_Balances.Total_Balance,
dbo.Transfer_to_Workout_Total_Balances.On_Balance_Last_Q,
dbo.Transfer_to_Workout_Total_Balances.Off_Balance_Last_Q,
dbo.Transfer_to_Workout_Total_Balances.NPL_Amount_Last_Quarter,
dbo.Transfer_to_workout_min_dates.Year_of_Transfer,
dbo.Transfer_to_workout_min_dates.Month_of_Transfer,
dbo.Transfer_to_workout_min_dates.Day_of_Transfer,
dbo.Transfer_to_workout_min_dates.CRS AS Expr1,
dbo.Transfer_to_workout_min_dates.Loan_Code AS Expr2,
dbo.Transfer_to_workout_min_dates.Min_Date_Last_Updated,
dbo.Transfer_to_Workout_Total_Balances.[On_Balance_Amount_€],
dbo.Transfer_to_Workout_Total_Balances.[Off_Balance_Amount_€],
dbo.Transfer_to_Workout_Total_Balances.[Total_Balance_€],
dbo.Transfer_to_Workout_Total_Balances.[NPL_Amount_€],
dbo.Transfer_to_Workout_Total_Balances.NPL_Amount_Last_Q,
dbo.Transfer_to_Workout_Total_Balances.On_Balance_Amount_Last_Quarter,
dbo.Transfer_to_Workout_Total_Balances.Material_Bucket,
dbo.Transfer_to_Workout_Total_Balances.Material_Bucket_Last_Quarter
FROM
dbo.Transfer_to_workout_min_dates
INNER JOIN
dbo.Transfer_to_Workout_Total_Balances ON dbo.Transfer_to_workout_min_dates.Loan_Code = dbo.Transfer_to_Workout_Total_Balances.Loan_Code
AND dbo.Transfer_to_workout_min_dates.Min_Date_Last_Updated = dbo.Transfer_to_Workout_Total_Balances.Date_Last_Updated
What is wrong with the join? Why am I not getting any results from the two queries above?
If your query 3 doesn't return a row but the underlying tables dbo.Transfer_to_workout_min_dates and dbo.Transfer_to_Workout_Total_Balances contain data, then most likely the join condition (ON) is not met for any pair of rows in those tables.
Try removing one of the terms (I would start with the Min_Date_Last_Updated one); that should change the result. Then compare the two fields in the output. You will probably see that they differ and so are never equal.

Trouble with SQL UNION operation

I have the following table:
I am trying to write an SQL query that returns three fields:
Year (ActionDate), Count of Built (actiontype = 12), Count of Lost (actiontype = a few different ones)
Basically, ActionType is a lookup code. So, I'd get back something like:
YEAR CountofBuilt CountofLost
1905 30 18
1929 12 99
1940 60 1
etc....
I figured this would take two SELECT statements put together with a UNION.
I tried the query below, but it only returns two columns (year and countBuilt). My countLost field doesn't appear.
My SQL currently (MS Access):
SELECT tblHist.ActionDate, Count(tblHist.ActionDate) as countBuilt
FROM ...
WHERE ((tblHist.ActionType)=12)
GROUP BY tblHist.ActionDate
UNION
SELECT tblHist.ActionDate, Count(tblHist.ActionDate) as countLost
FROM ...
WHERE (((tblHist.ActionType)<>2) AND
((tblHist.ActionType)<>3))
GROUP BY tblHist.ActionDate;
Use:
SELECT h.actiondate,
SUM(IIF(h.actiontype = 12, 1, 0)) AS numBuilt,
SUM(IIF(h.actiontype NOT IN (2,3), 1, 0)) AS numLost
FROM tblHist h
GROUP BY h.actiondate
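You should not use UNION for such queries. UNION appends the rows of the second SELECT beneath those of the first and takes its column names from the first SELECT, which is why countLost never appears as a column. There are many ways to do what you want, for example (updated to fit Access syntax):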
SELECT tblHist.ActionDate,
COUNT(SWITCH(tblHist.ActionType = 12,1)) as countBuilt,
COUNT(SWITCH(tblHist.ActionType <>1 AND tblHist.ActionType <>2 AND ...,1)) as countLost
FROM ..
WHERE ....
GROUP BY tblHist.ActionDate
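Both answers rely on conditional aggregation: one pass over the table, one GROUP BY, and one counter per category. Here is a small sketch of that pattern via Python's sqlite3 module. It uses the standard CASE expression in place of Access's IIF/SWITCH, stores the year directly in ActionDate, and treats codes 5 and 7 as the "lost" codes; those codes and the sample rows are purely hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblHist (ActionDate INTEGER, ActionType INTEGER)")
conn.executemany(
    "INSERT INTO tblHist VALUES (?, ?)",
    [
        (1905, 12), (1905, 12), (1905, 5), (1905, 2),
        (1929, 12), (1929, 7), (1929, 7), (1929, 3),
    ],
)

# Each SUM adds 1 for rows in its category and 0 otherwise, so the two
# counts come out side by side as columns instead of stacked as rows.
rows = conn.execute("""
    SELECT ActionDate,
           SUM(CASE WHEN ActionType = 12 THEN 1 ELSE 0 END)      AS countBuilt,
           SUM(CASE WHEN ActionType IN (5, 7) THEN 1 ELSE 0 END) AS countLost
    FROM tblHist
    GROUP BY ActionDate
    ORDER BY ActionDate
""").fetchall()

print(rows)
```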

What is a simple and efficient way to find rows with time-interval overlaps in SQL?

I have two tables, both with start time and end time fields. I need to find, for each row in the first table, all of the rows in the second table where the time intervals intersect.
For example:
<-----row 1 interval------->
<---find this--> <--and this--> <--and this-->
Please phrase your answer in the form of a SQL WHERE-clause, AND consider the case where the end time in the second table may be NULL.
Target platform is SQL Server 2005, but solutions from other platforms may be of interest also.
SELECT *
FROM table1,table2
WHERE table2.start <= table1.end
AND (table2.end IS NULL OR table2.end >= table1.start)
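A quick way to sanity-check this condition, including the NULL end time, is to run it on a handful of rows. The sketch below does so with Python's sqlite3 module. Integers stand in for timestamps, and the columns are named start_time and end_time (hypothetical names, chosen to avoid the reserved word END).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, start_time INTEGER, end_time INTEGER);
    CREATE TABLE table2 (id INTEGER, start_time INTEGER, end_time INTEGER);
    INSERT INTO table1 VALUES (1, 10, 20);
    INSERT INTO table2 VALUES
        (1,  5, 12),     -- overlaps the start of row 1
        (2, 12, 15),     -- lies inside row 1
        (3, 18, 25),     -- overlaps the end of row 1
        (4, 21, 30),     -- starts after row 1 ends
        (5,  1,  9),     -- ends before row 1 starts
        (6, 15, NULL),   -- open-ended, starts inside row 1
        (7, 25, NULL);   -- open-ended, starts after row 1 ends
""")

overlapping_ids = [row[0] for row in conn.execute("""
    SELECT t2.id
    FROM table1 AS t1
    JOIN table2 AS t2
      ON t2.start_time <= t1.end_time
     AND (t2.end_time IS NULL OR t2.end_time >= t1.start_time)
    ORDER BY t2.id
""")]

print(overlapping_ids)
```

A NULL end time is treated as "still running", so an open-ended row only needs to start no later than the first interval ends.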
It sounds very complicated until you start working in reverse.
Below I illustrate ONLY THE GOOD CASES (no overlap). They are defined by two simple conditions: the ranges do not overlap if condA OR condB is TRUE. To find the overlaps we negate that:
NOT condA AND NOT condB, which in our case just means reversing the signs (> becomes <=).
/*
|--------| A \___ CondA: b.ddStart > a.ddEnd
|=========| B / \____ CondB: a.ddS > b.ddE
|+++++++++| A /
*/
--DROP TABLE ran
create table ran ( mem_nbr int, ID int, ddS date, ddE date)
insert ran values
(100, 1, '2012-1-1','2012-12-30'), ----\ ovl
(100, 11, '2012-12-12','2012-12-24'), ----/
(100, 2, '2012-12-31','2014-1-1'),
(100, 3, '2014-5-1','2014-12-14') ,
(220, 1, '2015-5-5','2015-12-14') , ---\ovl
(220, 22, '2014-4-1','2015-5-25') , ---/
(220, 3, '2016-6-1','2016-12-16')
select DISTINCT a.mem_nbr , a.* , '-' [ ], b.dds, b.dde, b.id
FROM ran a
join ran b on a.mem_nbr = b.mem_nbr -- match by mem#
AND a.ID <> b.ID -- itself
AND b.ddS <= a.ddE -- NOT b.ddS > a.ddE
AND a.ddS <= b.ddE -- NOT a.ddS > b.ddE
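If the negation step feels like sleight of hand, it can be checked by brute force. The sketch below is plain Python, independent of any database, with small integers standing in for dates and function names made up for the purpose. It compares the negated condition with a literal "do the two closed ranges share at least one value" test for every valid pair of small ranges.

```python
from itertools import product


def overlaps(a_start, a_end, b_start, b_end):
    # NOT (b_start > a_end) AND NOT (a_start > b_end)
    return b_start <= a_end and a_start <= b_end


def shares_a_point(a_start, a_end, b_start, b_end):
    # Literal definition: the closed integer ranges have a value in common.
    return bool(set(range(a_start, a_end + 1)) & set(range(b_start, b_end + 1)))


# Try every pair of well-formed ranges with endpoints in 0..5.
mismatches = [
    (a_s, a_e, b_s, b_e)
    for a_s, a_e, b_s, b_e in product(range(6), repeat=4)
    if a_s <= a_e and b_s <= b_e
    and overlaps(a_s, a_e, b_s, b_e) != shares_a_point(a_s, a_e, b_s, b_e)
]

print(mismatches)  # an empty list means the two definitions always agree
```

Because the comparison is <= rather than <, ranges that merely touch at an endpoint count as overlapping, which matches the join condition above.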
"solutions from other platforms may be of interest also."
SQL Standard defines OVERLAPS predicate:
Specify a test for an overlap between two events.
<overlaps predicate> ::=
<row value constructor 1> OVERLAPS <row value constructor 2>
Example:
SELECT 1
WHERE ('2020-03-01'::DATE, '2020-04-15'::DATE) OVERLAPS
('2020-02-01'::DATE, '2020-03-15'::DATE)
-- 1
select * from table_1
right join
table_2 on
(
table_1.start between table_2.start and table_2.[end]
or
table_1.[end] between table_2.start and table_2.[end]
or
(table_1.[end] > table_2.start and table_2.[end] is null)
)
EDIT: OK, don't go for my solution, it performs terribly. The "where" solution is 14x faster. Oops...
Some statistics: running on a db with ~ 65000 records for both table 1 and 2 (no indexing), having intervals of 2 days between start and end for each row, running for 2 minutes in SQLSMSE (don't have the patience to wait)
Using join: 8356 rows in 2 minutes
Using where: 115436 rows in 2 minutes
And what if you want to analyse such overlaps at minute precision with 70m+ rows?
The only solution I could come up with myself was a time dimension table for the join;
otherwise the duplicate handling became a headache and the processing costs were astronomical.