refer to value of unnest (or other srf) in WHERE - sql

While trying to find a good answer to my question Analog of OUTER APPLY in other RDBMS (not SQL Server), I found a pretty nice PostgreSQL solution:
create table Transactions
(
    ID int,
    Date timestamp,
    Amount decimal(29, 2),
    Amount2 decimal(29, 2)
);
insert into Transactions (ID, Date, Amount, Amount2)
select 1, current_timestamp, 100.00, null union all
select 2, current_timestamp, 25.00, 75.00;
select
    T.ID,
    T.Date,
    unnest(array[T.Amount, T.Amount2]) as Amount
from Transactions as T;
SQL Fiddle
The point is to turn some columns into rows with the most readable and elegant code I could get, but I don't want to see the NULL values as rows. Is there any way to use the value from unnest in the WHERE clause of the query?

You can use a subquery and where to filter out the NULL values:
select id, date, Amount
from (select t.*, unnest(array[T.Amount, T.Amount2]) as Amount
from Transactions as T
) t
where Amount is not null;
Postgres doesn't allow unnest directly in the WHERE clause.
EDIT:
Unnest uses the length of the array to determine the number of rows. You can do this with standard SQL and no subquery, but you will probably find it messier:
select T.ID, T.Date,
(case when n = 1 then T.Amount
when n = 2 then T.Amount2
end) as Amount
from Transactions T cross join
(select 1 as n union all select 2) n
where (case when n = 1 then T.Amount
when n = 2 then T.Amount2
end) is not null;
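If you are on PostgreSQL 9.3 or later, a LATERAL join is another way to keep the query readable and still filter in a plain WHERE clause. This is only a sketch against the Transactions table from the question, not part of the original answers:
select T.ID, T.Date, u.Amount
from Transactions as T
cross join lateral unnest(array[T.Amount, T.Amount2]) as u(Amount)  -- unnest happens in FROM
where u.Amount is not null;
Because the unnest runs in the FROM clause, its output column u.Amount is an ordinary column that the WHERE clause can see.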

Subtraction of two SELECT statements in SQL (redshift)

Can someone explain why the below doesn't work?
((SELECT COUNT(*) FROM Table1) - (SELECT Count(Metric) FROM Table1)) as X
COUNT(*) will give me all the rows in the table and COUNT(Metric) will give me the number of non-null values in the Metric column, so the difference between these gives me the number of null values in the Metric column, which I have labelled X. I just want that difference in column X, but I'm not sure why it isn't working.
By the way, I know I can get it to work via the below:
SELECT COUNT(*) as a, count(metric) as b, COUNT(*)-COUNT(metric) as c
You would need to select the result:
SELECT ((SELECT COUNT(*) FROM Table1) - (SELECT Count(Metric) FROM Table1)) as X
But it is simpler to use conditional aggregation:
SELECT SUM(CASE WHEN Metric IS NULL THEN 1 ELSE 0 END) AS X FROM Table1
A SELECT query needs to start with SELECT (or WITH or a parenthesis if the query is a compound query with a set operator such as UNION ALL).
One method is:
SELECT ((SELECT COUNT(*) FROM Table1) - (SELECT Count(Metric) FROM Table1)) as X
A better method is:
SELECT COUNT(*) - Count(Metric) as X
FROM Table1
Not sure about Amazon Redshift, but in standard SQL I would just count the records where the field is null, instead of counting all rows minus the ones where it is not null.
SELECT COUNT(*) FROM Table1 WHERE Metric IS NULL;
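For illustration, here is a minimal self-contained sketch (the temp table and its values are made up, not from the question) showing why the difference of the two counts is the number of NULLs:
create temp table t1 (Metric int);
insert into t1 values (10), (null), (20), (null);
select count(*)                 as all_rows,       -- 4: counts every row
       count(Metric)            as non_null_rows,  -- 2: COUNT(column) skips NULLs
       count(*) - count(Metric) as X               -- 2: the NULL rows
from t1;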

How to minus current and previous value in SQL Server

I have one table and need to compute the difference between the previous and current amount in one column. The table values are below; I need to write the syntax for the Cal-Amount column.
Id  Amount  Cal-Amount
1   100     0
2   200     0
3   400     0
4   500     0
The Cal-Amount calculation formula with sample values:
Id  Amount  Cal-Amount
1   100     (0-100)=100
2   200     (100-200)=100
3   400     (200-400)=200
4   500     (400-500)=100
I need SQL syntax that subtracts the column's current and previous values.
LAG is one option if you are using SQL Server 2012 or later:
SELECT
Id,
Amount,
LAG(Amount, 1, 0) OVER (ORDER BY Id) - Amount AS [Cal-Amount]
FROM yourTable;
If you are using an earlier version of SQL Server, then we can use a self join:
SELECT
    t1.Id,
    t1.Amount,
    COALESCE(t2.Amount, 0) - t1.Amount AS [Cal-Amount]
FROM yourTable t1
LEFT JOIN yourTable t2
    ON t1.Id = t2.Id + 1;
But note that the self-join option might only work if the Id values are contiguous. LAG is probably the most efficient way to do this, and it is also robust to non-sequential Id values, as long as the order is correct.
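If LAG is not available and the Id values do have gaps, a ROW_NUMBER-based self join (SQL Server 2005+) is a common workaround; the following is only a sketch, reusing the yourTable name from above:
WITH numbered AS (
    SELECT Id, Amount,
           ROW_NUMBER() OVER (ORDER BY Id) AS rn
    FROM yourTable
)
SELECT
    t1.Id,
    t1.Amount,
    COALESCE(t2.Amount, 0) - t1.Amount AS [Cal-Amount]
FROM numbered t1
LEFT JOIN numbered t2
    ON t1.rn = t2.rn + 1;   -- t2 is the previous row by position, not by Id value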
Well, Tim beat me to the lag(), so here's the old-school using join:
select t.Id,t.Amount,t.Amount-isnull(t2.Amount,0) AS [Cal-Amount]
from yourtable t
left join yourtable t2 on t.id=t2.id+1
SQL Server 2012 or newer:
Select
    ID,
    Amount,
    [Cal-Amount] = Amount - LAG(Amount, 1, 0) OVER (ORDER BY Id)
From
    yourTable
or
Select
    [current].ID,
    [current].Amount,
    [current].Amount - IsNull([prior].Amount, 0) As [Cal-Amount]
from
    yourTable [current]
    left join yourTable [prior] on [current].id - 1 = [prior].id
You can use the LAG function if your SQL Server version is 2012 or later:
declare @t table (id int, amount1 int)
insert into @t
values (1, 100), (2, 200), (3, 400), (4, 500)
select
    *, amount1 - LAG(amount1, 1, 0) over (order by id) as CalAmount
from
    @t
You can also use APPLY:
select t.*, t.Amount - coalesce(tt.Amount, 0) as CalAmount
from yourTable t outer apply (
    select top (1) *
    from yourTable t1
    where t1.id < t.id
    order by t1.id desc
) tt;

SAP HANA | With Clause performance

We are using SAP HANA 1.0 SPS12.
We have a day-wise table like the one below:
select trans_date,article,measure1,measure2 from table_1
Volume of the table: ~5 million rows.
We need to see the data like this:
select 'day-1', sum(measure1), sum(measure2) from table1 where trans_date = add_days(current_date, -1) group by 'day-1'
union all
select 'day-2', sum(measure1), sum(measure2) from table1 where trans_date >= add_days(current_date, -2) group by 'day-2'
union all
select 'WTD', sum(measure1), sum(measure2) from table1 where trans_date >= add_days(current_date, -7) group by 'WTD'
union all
select 'WTD-1', sum(measure1), sum(measure2) from table1 where trans_date >= add_days(current_date, -15) and trans_date <= add_days(current_date, -7) group by 'WTD-1'
and so on for MTD, MTD-1, MTD-2, and YTD.
Performance-wise, is it better to use a WITH clause to hold one year of data and then split it by timeframe, or is it better to run a separate aggregation for each timeframe as shown above?
As far as I understand, in RDBMSs like Oracle the WITH clause materializes the results and reuses them from memory. SAP HANA is an in-memory database itself. Does using a WITH clause in SAP HANA give a distinct performance edge?
Query using a WITH clause:
WITH t1 as
(
    select trans_date, sum(measure1) as measure1, sum(measure2) as measure2
    from table1
    where trans_date >= add_days(current_date, -365)
    group by trans_date
)
select 'day-1', sum(measure1), sum(measure2) from t1 where trans_date = add_days(current_date, -1) group by 'day-1'
union all
select 'day-2', sum(measure1), sum(measure2) from t1 where trans_date >= add_days(current_date, -2) group by 'day-2'
union all
select 'WTD', sum(measure1), sum(measure2) from t1 where trans_date >= add_days(current_date, -7) group by 'WTD'
union all
select 'WTD-1', sum(measure1), sum(measure2) from t1 where trans_date >= add_days(current_date, -15)
    and trans_date <= add_days(current_date, -7)
group by 'WTD-1'
If you care about performance, putting the data in a single row should be much better:
select sum(case when trans_date = add_days(current_date, -1) then measure1 end) as measure1_day1,
sum(case when trans_date = add_days(current_date, -1) then measure2 end) as measure2_day1,
sum(case when trans_date = add_days(current_date, -2) then measure1 end) as measure1_day2,
sum(case when trans_date = add_days(current_date, -2) then measure2 end) as measure2_day2,
. . .
from table1
where trans_date >= add_days(current_date, -15);
You can unpivot the results afterwards, if you really need the values in separate rows.
Alternatively, you can do:
select days, sum(measure1), sum(measure2)
from (select 1 as days from dummy union all
select 2 from dummy union all
select 7 from dummy union all
select 15 from dummy
) d left join
table1 t
on t.trans_date = add_days(current_date, - d.days)
group by days
order by days;
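If some of your buckets are ranges rather than single days (WTD, MTD, YTD), the same single-pass idea works with range conditions; this is only a sketch using the column names and bucket definitions from the question:
select sum(case when trans_date >= add_days(current_date, -7) then measure1 end) as wtd_measure1,
       sum(case when trans_date >= add_days(current_date, -7) then measure2 end) as wtd_measure2,
       sum(case when trans_date >= add_days(current_date, -15)
                 and trans_date <= add_days(current_date, -7) then measure1 end) as wtd1_measure1,
       sum(case when trans_date >= add_days(current_date, -15)
                 and trans_date <= add_days(current_date, -7) then measure2 end) as wtd1_measure2
from table1
where trans_date >= add_days(current_date, -365);  -- one scan covers all buckets up to YTD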

How to join two tables with the same number of rows in SQLite?

I have almost the same problem as described in this question. I have two tables with the same number of rows, and I would like to join them together one by one.
The tables are ordered, and I would like to keep this order after the join, if it is possible.
There is a rowid-based solution for MSSQL, but in SQLite rowid cannot be used if the table comes from a WITH statement (or WITH RECURSIVE).
It is guaranteed that the two tables have the exact same number of rows, but this number is not known beforehand. It is also important to note that the same element may occur more than twice. The results are ordered, but none of the columns are unique.
Example code:
WITH
table_a (n) AS (
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
)
SELECT table_a.n, table_b.s
FROM table_a
LEFT JOIN table_b ON ( table_a.rowid = table_b.rowid )
The result I would like to achieve is:
(2, 'valuex'),
(4, 'valuey'),
(5, 'valuez')
SQLFiddle: http://sqlfiddle.com/#!5/9eecb7/6888
This is quite complicated in SQLite -- because you are allowing duplicates. But you can do it. Here is the idea:
Summarize the table by the values.
For each value, get the count and offset from the beginning of the values.
Then use a join to associate the values and figure out the overlap.
Finally use a recursive CTE to extract the values that you want.
The following code assumes that n and s are ordered -- as you specify in your question. However, it would work (with small modifications) if another column specified the ordering.
You will notice that I have included duplicates in the sample data:
WITH table_a (n) AS (
SELECT 2 UNION ALL
SELECT 4 UNION ALL
SELECT 4 UNION ALL
SELECT 4 UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex' UNION ALL
SELECT 'valuey' UNION ALL
SELECT 'valuey' UNION ALL
SELECT 'valuez' UNION ALL
SELECT 'valuez'
),
a as (
select a.n, count(*) as a_cnt,
(select count(*) from table_a a2 where a2.n < a.n) as a_offset
from table_a a
group by a.n
),
b as (
select b.s, count(*) as b_cnt,
(select count(*) from table_b b2 where b2.s < b.s) as b_offset
from table_b b
group by b.s
),
ab as (
select a.*, b.*,
max(a.a_offset, b.b_offset) as offset,
min(a.a_offset + a.a_cnt, b.b_offset + b.b_cnt) - max(a.a_offset, b.b_offset) as cnt
from a join
b
on a.a_offset + a.a_cnt - 1 >= b.b_offset and
a.a_offset <= b.b_offset + b.b_cnt - 1
),
cte as (
select n, s, offset, cnt, 1 as ind
from ab
union all
select n, s, offset, cnt, ind + 1
from cte
where ind < cnt
)
select n, s
from cte
order by n, s;
Here is a DB Fiddle showing the results.
I should note that this would be much simpler in almost any other database, using window functions (or perhaps variables in MySQL).
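For reference, the window-function version hinted at here would look roughly like this (SQLite gained window functions in version 3.25, so on a recent version this sketch works there as well):
with a as (select n, row_number() over (order by n) as rn from table_a),
     b as (select s, row_number() over (order by s) as rn from table_b)
select a.n, b.s
from a left join b on a.rn = b.rn
order by a.rn;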
Since the tables are ordered, you can add row_id values by comparing n values.
But still, the best way to get better performance would be to insert the ID values while creating the tables.
http://sqlfiddle.com/#!5/9eecb7/7014
WITH
table_a_a (n, id) AS
(
WITH table_a (n) AS
(
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
)
SELECT table_a.n, (select count(1) from table_a b where b.n <= table_a.n) id
FROM table_a
) ,
table_b_b (n, id) AS
(
WITH table_a (n) AS
(
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
)
SELECT table_a.n, (select count(1) from table_a b where b.n <= table_a.n) id
FROM table_a
)
select table_a_a.n,table_b_b.n from table_a_a,table_b_b where table_a_a.ID = table_b_b.ID
Or convert the input set to a comma-separated list and try it like this:
http://sqlfiddle.com/#!5/9eecb7/7337
WITH RECURSIVE table_b( id,element, remainder ) AS (
SELECT 0,NULL AS element, 'valuex,valuey,valuz,valuz' AS remainder
UNION ALL
SELECT id+1,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, 0, INSTR( remainder, ',' ) )
ELSE
remainder
END AS element,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, INSTR( remainder, ',' )+1 )
ELSE
NULL
END AS remainder
FROM table_b
WHERE remainder IS NOT NULL
),
table_a( id,element, remainder ) AS (
SELECT 0,NULL AS element, '2,4,5,7' AS remainder
UNION ALL
SELECT id+1,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, 0, INSTR( remainder, ',' ) )
ELSE
remainder
END AS element,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, INSTR( remainder, ',' )+1 )
ELSE
NULL
END AS remainder
FROM table_a
WHERE remainder IS NOT NULL
)
SELECT table_b.element, table_a.element FROM table_b, table_a WHERE table_a.element IS NOT NULL and table_a.id = table_b.id;
SQL
SELECT a1.n, b1.s
FROM table_a a1
LEFT JOIN table_b b1
ON (SELECT COUNT(*) FROM table_a a2 WHERE a2.n <= a1.n) =
(SELECT COUNT(*) FROM table_b b2 WHERE b2.s <= b1.s)
Explanation
The query simply counts the number of rows up until the current one for each table (based on the ordering column) and joins on this value.
Demo
See SQL Fiddle demo.
Assumptions
A single column is used for the ordering in each table. (But the query could easily be modified to allow multiple ordering columns.)
The ordering values in each table are unique.
The values in the ordering column aren't necessarily the same between the two tables.
It is known that table_a contains either the same or more rows than table_b. (If this isn't the case then a FULL OUTER JOIN would need to be emulated, since SQLite doesn't provide one; a sketch of such an emulation follows this list.)
No further changes to the table structure are allowed. (If they are, it would be more efficient to have pre-populated columns for the ordering).
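On SQLite versions without FULL OUTER JOIN, the usual emulation is the LEFT JOIN above plus the unmatched rows from the other side; this is only a sketch, assuming the ordering column n is never NULL:
SELECT a1.n, b1.s
FROM table_a a1
LEFT JOIN table_b b1
  ON (SELECT COUNT(*) FROM table_a a2 WHERE a2.n <= a1.n) =
     (SELECT COUNT(*) FROM table_b b2 WHERE b2.s <= b1.s)
UNION ALL
SELECT a1.n, b1.s
FROM table_b b1
LEFT JOIN table_a a1
  ON (SELECT COUNT(*) FROM table_a a2 WHERE a2.n <= a1.n) =
     (SELECT COUNT(*) FROM table_b b2 WHERE b2.s <= b1.s)
WHERE a1.n IS NULL;   -- rows of table_b that found no partner in table_a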
Either way...
Use something like
WITH
v_table_a (n, rowid) AS (
SELECT 2, 1
UNION ALL
SELECT 4, 2
UNION ALL
SELECT 5, 3
),
v_table_b (s, rowid) AS (
SELECT 'valuex', 1
UNION ALL
SELECT 'valuey', 2
UNION ALL
SELECT 'valuez', 3
)
SELECT v_table_a.n, v_table_b.s
FROM v_table_a
LEFT JOIN v_table_b ON ( v_table_a.rowid = v_table_b.rowid );
for "virtual" tables (with WITH or without),
WITH RECURSIVE vr_table_a (n, rowid) AS (
VALUES (2, 1)
UNION ALL
SELECT n + 2, rowid + 1 FROM vr_table_a WHERE rowid < 3
)
, vr_table_b (s, rowid) AS (
VALUES ('I', 1)
UNION ALL
SELECT s || 'I', rowid + 1 FROM vr_table_b WHERE rowid < 3
)
SELECT vr_table_a.n, vr_table_b.s
FROM vr_table_a
LEFT JOIN vr_table_b ON ( vr_table_a.rowid = vr_table_b.rowid );
for "virtual" tables using recursive WITHs (in this example the values are others then yours, but I guess you get the point) and
CREATE TABLE p_table_a (n INT);
INSERT INTO p_table_a VALUES (2), (4), (5);
CREATE TABLE p_table_b (s VARCHAR(6));
INSERT INTO p_table_b VALUES ('valuex'), ('valuey'), ('valuez');
SELECT p_table_a.n, p_table_b.s
FROM p_table_a
LEFT JOIN p_table_b ON ( p_table_a.rowid = p_table_b.rowid );
for physical tables.
I'd be careful with the last one though. A quick test shows that rowid values are a) reused -- when some rows are deleted and others are inserted, the inserted rows get the rowids of the old rows (i.e. rowid in SQLite isn't unique past the lifetime of a row, whereas e.g. Oracle's rowid AFAIR is) -- and b) correspond to the order of insertion. But I don't know, and didn't find a clue in the documentation, whether that's guaranteed or subject to change in other/future implementations. Or maybe it's just a mere coincidence in my test environment.
(In general, the physical order of rows may change (even within the same database using the same DBMS, as a result of some reorganization) and is therefore no good choice to rely on. Nor is it guaranteed that a query will return its result ordered by physical position in the table (it might use the order of some index instead, or a partial result ordered some other way may influence the output's order). Consider designing your tables with common (sort) keys in corresponding rows to order by and join on.)
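As a sketch of that last suggestion (the table and column names here are made up, not from the question), an explicit sequence column makes the pairing deterministic and trivial to join on:
CREATE TABLE ordered_a (seq INTEGER PRIMARY KEY, n INT);
CREATE TABLE ordered_b (seq INTEGER PRIMARY KEY, s VARCHAR(6));
INSERT INTO ordered_a VALUES (1, 2), (2, 4), (3, 5);
INSERT INTO ordered_b VALUES (1, 'valuex'), (2, 'valuey'), (3, 'valuez');
SELECT a.n, b.s
FROM ordered_a a
JOIN ordered_b b ON a.seq = b.seq   -- the join key is stored, not inferred from rowid
ORDER BY a.seq;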
You can create temp tables to carry the CTE data rows, then JOIN them on the SQLite rowid column.
CREATE TEMP TABLE temp_a(n integer);
CREATE TEMP TABLE temp_b(n VARCHAR(255));
WITH table_a(n) AS (
SELECT 2 n
UNION ALL
SELECT 4
UNION ALL
SELECT 5
UNION ALL
SELECT 5
)
INSERT INTO temp_a (n) SELECT n FROM table_a;
WITH table_b (n) AS
(
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
UNION ALL
SELECT 'valuew'
)
INSERT INTO temp_b (n) SELECT n FROM table_b;
SELECT *
FROM temp_a a
INNER JOIN temp_b b on a.rowid = b.rowid;
sqlfiddle:http://sqlfiddle.com/#!5/9eecb7/7252
It is possible to use the rowid inside a with statement but you need to select it and make it available to the query using it.
Something like this:
with tablea AS (
select id, rowid AS rid from someids),
tableb AS (
select details, rowid AS rid from somedetails)
select tablea.id, tableb.details
from
tablea
left join tableb on tablea.rid = tableb.rid;
It is, however, as others have already warned you, a really bad idea. What if the app breaks after inserting into one table but before the other? What if you delete an old row? If you want to join two tables you need to specify the field to do so. There are so many things that could go wrong with this design. The most similar sound approach would be an incremental id field that you save in the table and use in your application. Even simpler, make those into one table.
Read this link for more information about the rowid: https://www.sqlite.org/lang_createtable.html#rowid
sqlfiddle: http://sqlfiddle.com/#!7/29fd8/1
The problem statement indicates:
The tables are ordered
If this means that the ordering is defined by the ordering of the values in the UNION ALL statements, and if SQLite respects that ordering, then the following solution may be of interest because, apart from small tweaks to the last three lines of the sample program, it adds just two lines:
A(rid,n) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, n FROM table_a),
B(rid,s) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, s FROM table_b)
That is, table A is table_a augmented with a rowid, and similarly for table B.
Unfortunately, there is a caveat, though it might just be the result of my not having found the relevant specifications. Before delving into that, however, here is the full proposed solution:
WITH
table_a (n) AS (
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
),
A(rid,n) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, n FROM table_a),
B(rid,s) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, s FROM table_b)
SELECT A.n, B.s
FROM A LEFT JOIN B
ON ( A.rid = B.rid );
Caveat
The proposed solution has been tested against a variety of data sets using sqlite version 3.29.0, but whether or not it is, and will continue to be, "guaranteed" to work is unclear to me.
Of course, if SQLite offers no guarantees with respect to the ordering of the UNION ALL statements (that is, if the question is based on an incorrect assumption), then it would be interesting to see a well-founded reformulation.

Counting the rows of a column where the value of a different column is 1

I am using a select count distinct to count the number of records in a column. However, I only want to count the records where the value of a different column is 1.
So my table looks a bit like this:
Name   Type
abc    1
def    2
ghi    2
jkl    1
mno    1
and I want the query only to count abc, jkl and mno and thus return '3'.
I wasn't able to do this with the CASE function, because this only seems to work with conditions in the same column.
EDIT: Sorry, I should have added, I want to make a query that counts both types.
So the result should look more like:
1   3
2   2
SELECT COUNT(*)
FROM dbo.[table name]
WHERE [type] = 1;
If you want to return the counts by type:
SELECT [type], COUNT(*)
FROM dbo.[table name]
GROUP BY [type]
ORDER BY [type];
You should avoid using keywords like type as column names - you can avoid a lot of square brackets if you use a more specific, non-reserved word.
I think you'll want (assuming that you wouldn't want to count ('abc',1) twice if it is in your table twice):
select count(distinct name)
from mytable
where type = 1
EDIT: for getting all types
select type, count(distinct name)
from mytable
group by type
order by type
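If you would rather have both counts on one row, conditional aggregation works as well, and it shows that the CASE condition can test one column while you count another; a sketch, reusing the names from the question:
select count(distinct case when type = 1 then name end) as type1_count,
       count(distinct case when type = 2 then name end) as type2_count
from mytable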
select count(1) from tbl where type = 1
;WITH MyTable (Name, [Type]) AS
(
SELECT 'abc', 1
UNION
SELECT 'def', 2
UNION
SELECT 'ghi', 2
UNION
SELECT 'jkl', 1
UNION
SELECT 'mno', 1
)
SELECT COUNT( DISTINCT Name)
FROM MyTable
WHERE [Type] = 1