Average by rows with sqlite - sql

I'm working with a sqlite database.
The tables are:
ID_TABLE           POINTS_A_TABLE       POINTS_B_TABLE
id       number    id_a     points_a    id_a   points_a
---------------    -----------------    ----------------
smith    1         smith    11          ...
gordon   22        gordon   11
butch    3         butch    11
sparrow  25        sparrow
white    76        white    46
and so on. After these commands
select id,
points_a_table.points_a, points_b_table.points_a, points_c_table.points_a, points_d_table.points_a
from id_table
left join points_a_table on points_a_table.id_a = id_table.id
left join points_b_table on points_b_table.id_a = id_table.id
left join points_c_table on points_c_table.id_a = id_table.id
left join points_d_table on points_d_table.id_a = id_table.id
group by id
I get a result in which each row has the id and the points associated with that id.
Now I'd like to get an average of points by row, sorted by average in descending order.
What I want is:
sparrow| 56 [(44+68)/2]
white | 41 [(46+67+11)/3]
smith | 33 [(11+25+65)/3]
butch | 24 [(11+26+11)/3]
gordon | 11 [11/1]
How can I do that?
Thanks.

If you smush together all point tables, you can then simply compute the average for each group:
SELECT id,
avg(points_a)
FROM (SELECT id_a AS id, points_a FROM points_a_table
UNION ALL
SELECT id_a AS id, points_a FROM points_b_table
UNION ALL
SELECT id_a AS id, points_a FROM points_c_table
UNION ALL
SELECT id_a AS id, points_a FROM points_d_table)
GROUP BY id
ORDER BY avg(points_a) DESC;
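A minimal sketch of the same idea, run through Python's built-in sqlite3 module so it can be tried without a database file. Only two of the point tables are created, and the rows are a made-up subset of the question's data; the NULL row for sparrow is an assumption standing in for the blank cell in the question. avg() skips NULLs, which is why a missing score does not drag the average down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE points_a_table (id_a TEXT, points_a INTEGER);
    CREATE TABLE points_b_table (id_a TEXT, points_a INTEGER);
    -- Illustrative rows only; sparrow has no score in table A.
    INSERT INTO points_a_table VALUES ('smith', 11), ('gordon', 11), ('sparrow', NULL);
    INSERT INTO points_b_table VALUES ('smith', 25), ('sparrow', 44);
""")

# Stack the point tables, then average per id; avg() ignores NULL values.
rows = conn.execute("""
    SELECT id, avg(points_a) AS average
    FROM (SELECT id_a AS id, points_a FROM points_a_table
          UNION ALL
          SELECT id_a AS id, points_a FROM points_b_table)
    GROUP BY id
    ORDER BY average DESC
""").fetchall()

for row in rows:
    print(row)
```

Note that ids with no row in any point table do not appear at all; a left join from id_table onto this result would bring them back with a NULL average.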

Related

Left outer join returns duplicate records

I have 3 tables that I have to join to get the latest data. The 3 tables are "STUDENT", "MATH" and "ENGLISH".
STUDENT table contains:
ID NAME CLASS CODE MODIFIED_DATE
-------------------------------------
1 ABC First 1234 01-10-2020
2 EFG Second 3421 01-01-2020
3 XYZ Third 1434 01-01-2020
1 ABC First 9999 01-01-2021
MATH table contains:
ID MSCORE MDATE
----------------
1 80 20-09-2020
2 71 10-12-2020
1 74 04-03-2021
2 90 13-03-2020
ENGLISH table contains:
ID ESCORE EDATE
---------------
1 72 21-04-2021
2 43 19-01-2021
3 60 01-01-2021
3 38 01-05-2021
Result should be:
ID  NAME  CODE  MSCORE  MDATE       ESCORE  EDATE
--------------------------------------------------
1   ABC   9999  74      04-03-2021  72      21-04-2021
2   EFG         71      10-12-2020  43      19-01-2021
3   XYZ                             38      01-05-2021
But I am getting duplicate records for each ID when I use the query below.
select a.ID,a.NAME,a.CODE,b.MSCORE,b.MDATE,c.ESCORE,c.EDATE from STUDENT a LEFT OUTER JOIN MATH b ON a.ID=b.ID LEFT OUTER JOIN ENGLISH c ON a.ID=c.ID;
Could someone please tell me the correct query to fetch one record per ID from the tables, based on the latest date given in the MATH and ENGLISH tables?
EDIT:
I have added a CODE column to the STUDENT table, and when I run the query I should get the latest code for each ID.
If you want the most recent row from each table, use window functions:
select s.*, m.MSCORE, m.MDATE, e.ESCORE, e.EDATE
from (select s.*,
row_number() over (partition by s.id order by modified_date desc) as seqnum
from STUDENT s
) s LEFT OUTER JOIN
(select m.*,
row_number() over (partition by m.id order by m.mdate desc) as seqnum
from MATH m
) m
on m.ID = s.ID and m.seqnum = 1 LEFT OUTER JOIN
(select e.*,
row_number() over (partition by e.id order by e.edate desc) as seqnum
from ENGLISH e
) e
on e.id = s.id and e.seqnum = 1
where s.seqnum = 1;
Note that I have replaced your meaningless table aliases with abbreviations for the table names. This makes the query much simpler to read and maintain.
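As a rough illustration, here is the same row_number() pattern run against SQLite from Python (window functions need SQLite 3.25 or later). Two things are assumptions of this sketch, not part of the question: only STUDENT and MATH are included, and the dates are stored as ISO 'YYYY-MM-DD' text so that sorting them as strings matches chronological order. The DD-MM-YYYY strings in the question would not sort correctly as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT, code TEXT, modified_date TEXT);
    CREATE TABLE math (id INTEGER, mscore INTEGER, mdate TEXT);
    -- Question's data, with dates rewritten in ISO format (assumption).
    INSERT INTO student VALUES
        (1, 'ABC', '1234', '2020-10-01'),
        (2, 'EFG', '3421', '2020-01-01'),
        (1, 'ABC', '9999', '2021-01-01');
    INSERT INTO math VALUES
        (1, 80, '2020-09-20'), (2, 71, '2020-12-10'),
        (1, 74, '2021-03-04'), (2, 90, '2020-03-13');
""")

# Number each table's rows per id, newest first, and keep only row 1.
rows = conn.execute("""
    SELECT s.id, s.name, s.code, m.mscore, m.mdate
    FROM (SELECT s.*,
                 row_number() OVER (PARTITION BY id ORDER BY modified_date DESC) AS seqnum
          FROM student s) s
    LEFT JOIN (SELECT m.*,
                      row_number() OVER (PARTITION BY id ORDER BY mdate DESC) AS seqnum
               FROM math m) m
           ON m.id = s.id AND m.seqnum = 1
    WHERE s.seqnum = 1
    ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

The filter on m.seqnum sits in the ON clause rather than the WHERE clause so that students without any MATH row are still returned.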
A second way to do this is to use a correlated sub-query on each table, before joining them, to pick the latest record for each ID:
Select s.id, s.name, s.code,m.mscore,m.MDATE, e.ESCORE, e.EDATE
From
(Select * from Student s1
Where modified_date=(Select max(modified_date)
From Student s2
Where s2.id=s1.id)
) s LEFT OUTER JOIN
(Select * from Math m1
Where mdate=(Select max(mdate)
From Math m2
Where m2.id=m1.id)
) m ON s.id=m.id LEFT OUTER JOIN
(Select * from English e1
Where edate=(Select max(edate)
From English e2
Where e2.id=e1.id)
) e ON s.id=e.id
Also, you should really store your three modification dates as date-time types to distinguish between different modifications made on the same day. If two such records appear in your tables, this query fails by bringing back both records, while Gordon Linoff's answer could return a row that was not the most recent.
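The tie behaviour is easy to reproduce. The sketch below uses a made-up MATH table in which ID 1 has two rows on the same date (hypothetical data, with ISO dates assumed), and applies the max() sub-query to it from Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE math (id INTEGER, mscore INTEGER, mdate TEXT);
    -- Hypothetical data: id 1 was modified twice on the same day.
    INSERT INTO math VALUES
        (1, 80, '2021-03-04'),
        (1, 74, '2021-03-04'),
        (2, 71, '2020-12-10');
""")

# Keep the rows whose date equals the latest date for that id.
latest = conn.execute("""
    SELECT m1.id, m1.mscore, m1.mdate
    FROM math m1
    WHERE m1.mdate = (SELECT max(m2.mdate) FROM math m2 WHERE m2.id = m1.id)
    ORDER BY m1.id, m1.mscore
""").fetchall()

for row in latest:
    print(row)  # id 1 appears twice because both rows share the latest date
```

With a full timestamp instead of a date, the two rows for ID 1 would no longer tie and only one would survive the filter.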

Ranking of a tuple in another table

I have two tables, team A and team B, each holding users and their scores. I want the rank of the score of every member of team A within team B, using SQL or Vertica, as shown below:
Team A Table
user score
-------------
asa 100
bre 200
cqw 50
duy 50
Team B Table
user score
------------
gfh 20
ewr 80
kil 70
cvb 90
Output:
Team A Table
user score rank in team B
------------------------------
asa 100 1
bre 200 1
cqw 50 4
duy 50 4
Try this - and this only works in Vertica.
INTERPOLATE PREVIOUS VALUE is an outer-join predicate specific to Vertica that joins two tables on non-equal columns, using the 'last known' value in the outer-joined table to make a match succeed.
WITH
-- input, don't use in query itself
table_a (the_user,score) AS (
SELECT 'asa',100
UNION ALL SELECT 'bre',200
UNION ALL SELECT 'cqw',50
UNION ALL SELECT 'duy',50
)
,
table_b(the_user,score) AS (
SELECT 'gfh',20
UNION ALL SELECT 'ewr',80
UNION ALL SELECT 'kil',70
UNION ALL SELECT 'cvb',90
)
-- end of input - start WITH clause here
,
ranked_b AS (
SELECT
RANK() OVER(ORDER BY score DESC) AS the_rank
, *
FROM table_b
)
SELECT
a.the_user AS a_user
, a.score AS a_score
, b.the_rank AS rank_in_team_b
FROM table_a a
LEFT JOIN ranked_b b
ON a.score INTERPOLATE PREVIOUS VALUE b.score
ORDER BY 1
;
a_user|a_score|rank_in_team_b
asa | 100| 1
bre | 200| 1
cqw | 50| 4
duy | 50| 4
A simple correlated query should do:
select
a.*,
(select count(*) + 1 from table_b b where b.score > a.score) rank_in_b
from table_a a;
All you need to do is count the number of people in table_b with a higher score than the current user and add 1 to it to get the rank.
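A small runnable check of the counting approach, using the question's data in an in-memory SQLite database. The column name the_user is borrowed from the Vertica answer, since user can clash with a reserved word in some databases; the table names table_a and table_b are likewise assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (the_user TEXT, score INTEGER);
    CREATE TABLE table_b (the_user TEXT, score INTEGER);
    INSERT INTO table_a VALUES ('asa', 100), ('bre', 200), ('cqw', 50), ('duy', 50);
    INSERT INTO table_b VALUES ('gfh', 20), ('ewr', 80), ('kil', 70), ('cvb', 90);
""")

# Rank = 1 + number of team B scores strictly greater than the team A score.
rows = conn.execute("""
    SELECT a.the_user, a.score,
           (SELECT count(*) + 1 FROM table_b b WHERE b.score > a.score) AS rank_in_b
    FROM table_a a
    ORDER BY a.the_user
""").fetchall()

for row in rows:
    print(row)
```

Because the comparison is strict, a team A score that equals a team B score shares that member's rank instead of falling below it.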

Oracle SQL - Joining tables to include one column to the other (vice versa)

These are my sample tables, columns and records...
Table: tbl1
-----------------------
Columns: ID | DEPT | WK | MANHRS
Records: 01 A 1 8
02 A 2 2
Table: tbl2
--------------------------------
Columns: ID | DEPT | WK | WAGES
Records: 01 A 1 3
02 A 2 5
Scenario:
I want to have a result where two tables are joined and MANHRS and WAGES columns are both together in the result set.
Expected output of the result table:
Columns: ID | DEPT | WK | MANHRS | WAGES
01 A 1 8 3
02 A 2 2 5
I tried UNION but didn't get my expected result. :(
How to do this?
The proper way to write the query is:
SELECT t1.*, t2.WAGES
FROM tbl1 t1 JOIN
tbl2 t2
ON t1.DEPT = t2.DEPT and t1.WK = t2.WK;
Notes:
Never use commas in the FROM clause. Always use proper, explicit JOIN syntax.
I'm not sure if ID should be in the JOIN conditions.
If you want all rows in both tables, but some might be missing, then use FULL JOIN.
You can write the query with the USING clause:
SELECT ID, DEPT, WK, t1.MANHRS, t2.WAGES
FROM tbl1 t1 JOIN
tbl2 t2
USING (ID, DEPT, WK);
This is particularly useful if you are using a FULL JOIN.
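For a quick check of the USING form, the sketch below runs it in SQLite through Python rather than in Oracle, which is an assumption of convenience; the inner-join syntax shown is the same in both. It sticks to an inner join because FULL JOIN is only available in newer SQLite releases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id TEXT, dept TEXT, wk INTEGER, manhrs INTEGER);
    CREATE TABLE tbl2 (id TEXT, dept TEXT, wk INTEGER, wages INTEGER);
    INSERT INTO tbl1 VALUES ('01', 'A', 1, 8), ('02', 'A', 2, 2);
    INSERT INTO tbl2 VALUES ('01', 'A', 1, 3), ('02', 'A', 2, 5);
""")

# USING merges the shared key columns, so they are selected without a table prefix.
rows = conn.execute("""
    SELECT id, dept, wk, t1.manhrs, t2.wages
    FROM tbl1 t1
    JOIN tbl2 t2 USING (id, dept, wk)
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```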
Assuming that you should join by DEPT and WK:
SELECT t1.*, t2.WAGES
FROM tbl1 t1, tbl2 t2
where t1.DEPT = t2.DEPT and t1.WK = t2.WK

SQL join - duplicate rows

I have three tables (simplified version - the whole picture is a bit more complex).
TABLE: CUSTOMER TABLE: PURCHASE1 TABLE: PURCHASE2
=============== ======================= =======================
CustomerID CustomerID | ProductID CustomerID | ProductID
--------------- ------------|---------- ------------|----------
1 1 | 51 1 | 81
2 1 | 52 1 | 82
3 2 | 52 1 | 83
I know the table structure isn't the best but that's not what I need help with. The products held in the purchase tables are of different types, if that helps to provide context.
I'm trying to join the tables, using a query like this:
Select
customer.customerid, purchase1.productid as P1,
purchase2.productid as P2
From
customer
Left join
purchase1 on customer.customerid = purchase1.customerid
Left join
purchase2 on customer.customerid = purchase2.customerid
Where
customer.customerid = 1;
This produces the following:
CustomerID | P1 | P2
--------------------
1 | 51 | 81
1 | 51 | 82
1 | 51 | 83
1 | 52 | 81
1 | 52 | 82
1 | 52 | 83
How do I get it to do this instead?
CustomerID | P1 | P2
-----------|------|---
1 | 51 | null
1 | 52 | null
1 | null | 81
1 | null | 82
1 | null | 83
The first table has a row for every combination of P1 and P2. The second table only has a row for each customer-product combination.
Can I do this without using UNION? The reason I ask is that the query will become more complex, using columns from other rows that aren't in PURCHASE1 or PURCHASE2.
If I have to use UNION, how can I do it such that I can still select from other tables and have additional columns in my query?
Use UNION. In a union, both queries must return the same number of columns, so use NULL to pad the missing column in each query.
Select * from (Select customer.customerid, purchase1.productid as P1, NULL as P2
from customer
INNER join purchase1
on customer.customerid = purchase1.customerid
UNION ALL
Select customer.customerid, NULL as P1, purchase2.productid as P2
from customer
INNER join purchase2
on customer.customerid = purchase2.customerid) tb
where tb.customerid = 1;
I would do it this way:
select customerid, p1, p2
from customer
left join (
select customerid, productid p1, null p2 from purchase1
union all
select customerid, null p1, productid p2 from purchase2
) using (customerid)
where customerid = 1;
Now you can attach the rest of the tables without repeating the logic.
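Here is that query as a runnable sketch against the question's sample data, using SQLite from Python. The alias p on the derived table and the ORDER BY clause are additions of this sketch, the latter only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customerid INTEGER);
    CREATE TABLE purchase1 (customerid INTEGER, productid INTEGER);
    CREATE TABLE purchase2 (customerid INTEGER, productid INTEGER);
    INSERT INTO customer VALUES (1), (2), (3);
    INSERT INTO purchase1 VALUES (1, 51), (1, 52), (2, 52);
    INSERT INTO purchase2 VALUES (1, 81), (1, 82), (1, 83);
""")

# Stack both purchase tables first, padding the other product column with NULL,
# then join the stacked result to customer once.
rows = conn.execute("""
    SELECT customerid, p1, p2
    FROM customer
    LEFT JOIN (SELECT customerid, productid AS p1, NULL AS p2 FROM purchase1
               UNION ALL
               SELECT customerid, NULL AS p1, productid AS p2 FROM purchase2) AS p
         USING (customerid)
    WHERE customerid = 1
    ORDER BY p1 IS NULL, p1, p2
""").fetchall()

for row in rows:
    print(row)
```

Since the union lives in a single derived table, further joins from customer to other tables can be added alongside it without duplicating them in each branch.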
I would first of all union up all the tables and then join them to the customer table - like so:
with customer as (select 1 customerid, 'bob' name from dual union all
select 2 customerid, 'ted' name from dual union all
select 3 customerid, 'joe' name from dual),
purchase1 as (select 1 customerid, 51 productid from dual union all
select 1 customerid, 52 productid from dual union all
select 2 customerid, 52 productid from dual),
purchase2 as (select 1 customerid, 81 productid from dual union all
select 1 customerid, 82 productid from dual union all
select 1 customerid, 83 productid from dual),
-- end of mimicking your table and data; main query is below:
purchases as (select customerid, productid productid1, null productid2
from purchase1
union all
select customerid, null productid1, productid productid2
from purchase2)
select c.customerid,
c.name,
p.productid1,
p.productid2
from customer c
inner join purchases p on (c.customerid = p.customerid)
order by c.customerid,
p.productid1,
p.productid2;
CUSTOMERID NAME PRODUCTID1 PRODUCTID2
---------- ---- ---------- ----------
1 bob 51
1 bob 52
1 bob 81
1 bob 82
1 bob 83
2 ted 52
It's probably easiest to just change it to a union query like this.
select customer.customerid, purchase1.productid as P1, null as P2
from customer
left join purchase1
on customer.customerid = purchase1.customerid
-- each branch of the union needs its own filter
where customer.customerid = 1
union all
select customer.customerid, null as P1, purchase2.productid as P2
from customer
left join purchase2
on customer.customerid = purchase2.customerid
where customer.customerid = 1;
This uses Union, but in a slightly different way, within subqueries, which might provide you more flexibility.
select distinct t1.pID,t2.pID
from (select ID,pID from Purchase1
union all
select ID, null from Purchase1) t1
right join (select ID,pID from Purchase2
union all
select ID, null from Purchase2) t2
on t1.ID = t2.ID
where t1.ID = 1
and (t1.pID is not null or t2.pID is not null)
and (t1.pID is null or t2.pID is null)

How can I avoid a Cartesian product in SQL on multiple tables

Here is my sqlfiddle http://sqlfiddle.com/#!3/671c8/1.
Here are my tables:
Person
PID LNAME FNAME
1 Bob Joe
2 Smith John
3 Johnson Jake
4 Doe Jane
Table1
PID VALUE
1 3
1 5
1 35
2 10
2 15
3 8
Table2
PID VALUE
1 X1
1 X2
1 X3
2 Z1
3 X3
I am trying to join several tables on a person's ID. These tables contain events with dates, but the dates may or may not match across tables. What I really want is to join the tables, regardless of date, so that the number of rows in my result equals the row count of the largest table, with all other tables "fitting" within it. For example:
Instead of this, which is a Cartesian product:
PID LNAME FNAME THINGONE THINGTWO
1 Bob Joe 3 X1
1 Bob Joe 3 X2
1 Bob Joe 3 X3
1 Bob Joe 5 X1
1 Bob Joe 5 X2
1 Bob Joe 5 X3
1 Bob Joe 35 X1
1 Bob Joe 35 X2
1 Bob Joe 35 X3
I would like something like this:
PID LNAME FNAME THINGONE THINGTWO
1 Bob Joe 3 X1
1 Bob Joe 5 X2
1 Bob Joe 35 X3
My sql statement:
SELECT
p.*,
t1.value as thingone,
t2.value as thingtwo
FROM
person p
left outer join table1 t1 on p.pid=t1.pid
left outer join table2 t2 on p.pid=t2.pid
;
I can't fathom why you want to do this, but...
You need to create an artificial join between table1 and table2, and then link that to the master table. One way of doing that is by ranking the rows in order, e.g.:
SELECT
p.pid, p.lname,p.fname, thingone, thingtwo
FROM
person p
left outer join
(
select ISNULL(t1.pid, t2.pid) as pid, t1.value as thingone, t2.value as thingtwo
from
(select *, ROW_NUMBER() over (partition by pid order by value) rn
from table1) t1
full outer join
(select *, ROW_NUMBER() over (partition by pid order by value) rn
from table2) t2
on t1.pid=t2.pid and t1.rn=t2.rn
) v
on p.pid = v.pid
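To make the intent of the row-number join concrete, the pairing can be pictured outside SQL. The sketch below is plain Python, not a translation of the query: it sorts each person's values, pairs them by position with itertools.zip_longest (which plays the role of the full outer join on rn), and keeps people with no rows at all, as the left join from person does. The dictionaries and the helper name pair_by_position are hypothetical.

```python
from itertools import zip_longest

# Question's sample data, keyed by PID.
table1 = {1: [3, 5, 35], 2: [10, 15], 3: [8]}
table2 = {1: ["X1", "X2", "X3"], 2: ["Z1"], 3: ["X3"]}
people = [1, 2, 3, 4]

def pair_by_position(pid):
    """Pair the n-th value of table1 with the n-th value of table2 for one person."""
    ones = sorted(table1.get(pid, []))
    twos = sorted(table2.get(pid, []))
    # zip_longest pads the shorter list with None, like unmatched rows in a full outer join.
    pairs = list(zip_longest(ones, twos))
    # A person with no rows in either table still yields one all-None row.
    return [(pid, a, b) for a, b in pairs] or [(pid, None, None)]

result = [row for pid in people for row in pair_by_position(pid)]

for row in result:
    print(row)
```

Each person ends up with as many rows as their longer list, which is the shape the question asks for.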
This is a trickier problem than I thought. The challenge is being sure that all the records appear, regardless of the lengths of the two lists. The following works by enumerating each of the lists and using that for the join conditions:
SELECT p.*,
t1.value as thingone,
t2.value as thingtwo
FROM person p left outer join
(select t1.*,
row_number() over (partition by pid order by pid) as seqnum,
count(*) over (partition by pid) as cnt
from table1 t1
) t1
on p.pid = t1.pid left outer join
(select t2.*, row_number() over (partition by pid order by pid) as seqnum,
count(*) over (partition by pid) as cnt
from table2 t2
) t2
on p.pid = t2.pid
WHERE t1.seqnum = t2.seqnum or
(t2.seqnum > t1.cnt) or
(t1.seqnum > t2.cnt) or
t1.seqnum is null or
t2.seqnum is null;
Here is a slight modification to your SQL Fiddle that has better test data.
EDIT:
The logic in the where clause handles these cases (in order by the clauses):
Where the two lists have sequence numbers, these must match.
Where list2 is longer and list1 has at least one element.
Where list1 is longer and list2 has at least one element.
Where list1 is empty.
Where list2 is empty.
These were arrived at by trial and error, because the original condition did not work:
on p.pid = t2.pid and t1.seqnum = t2.seqnum
This returns NULL values for p.id for the extra elements on the list. Podliuska's approach may also work; I had just started down this path and the where conditions do the trick.