SQLite query three tables

SQLite query three tables - sql

I'm working on an Android app that will display some information from a sqlite database in a listview. I need some help sorting out my query.
The database looks like this:
[monitors] 1 --- <has> --- * [results] 1 --- <has> --- * [steps]
Table monitors has columns: _id | warning_threshold | alarm_threshold | monitor_name
Table results has columns: _id | monitor_id | timestamp | test_info
Table steps has columns: _id | result_id | correct | response_time
I'm trying to make a query that would return:
1) All rows & columns from the monitors table.
2) The newest test_info for each monitor from the results table.
3) Count the number of correct = true for each result from the steps table.
The returned cursor should look something like this:
_id | monitor_name | warning_threshold | alarm_threshold | test_info | correct_count
1 | 'hugo' | 1000 | 1500 | 'some info' | 7
2 | 'kurt' | 800 | 1200 | 'info.....' | 5
My query:
SELECT * FROM
(SELECT monitors._id AS _id,
monitors.monitor_name AS monitor_name,
monitors.warning_threshold AS warning_threshold,
monitors.alarm_threshold AS alarm_threshold,
results.test_info AS test_info
FROM monitors
LEFT JOIN results
ON monitors._id = results.monitor_id
ORDER BY results.timestamp ASC) AS inner
GROUP BY inner._id;
I almost got it working. I am able to get the info from monitors and results, I still need to get the correct_count. Any help with sorting out this query would be greatly appreciated.

This is my approach, using a combination of Left Joins, sub queries, and correlated subqueries:
SELECT monitors._id AS _id,
monitors.monitor_name AS monitor_name,
monitors.warning_threshold AS warning_threshold,
monitors.alarm_threshold AS alarm_threshold,
LastResults.test_info AS test_info,
COUNT(CorrectSteps._id) AS correct_count
FROM monitors
LEFT JOIN
(SELECT * FROM results as r1 where timestamp =
(SELECT Max(r2.timestamp) FROM results AS r2 WHERE r1.monitor_id=r2.monitor_id)) LastResults
ON monitors._id = LastResults.monitor_id
LEFT JOIN
(SELECT * FROM steps WHERE correct = 'true') CorrectSteps
ON LastResults._id = CorrectSteps.result_id
GROUP BY monitors._id;

Something like this should work. I haven't been able to test it out but hopefully it will at least get you started. Note that this query is not even close to optimized. Wrote it quickly during my lunch :)
SELECT m._id,
m.monitor_name,
m.warning_threshold,
m.alarm_threshold,
(SELECT r.test_info
FROM results r
WHERE r.monitor_id = m._id
ORDER BY r.timestamp ASC
LIMIT 1) as 'test_info',
(SELECT COUNT(_id)
FROM steps s
WHERE s.result_id IN (SELECT _id FROM results WHERE monitor_id = m._id)
AND s.correct = 'true') as 'correct_count'
FROM monitor m

Related

Getting the latest entry per day / SQL Optimizing

Given the following database table, which records events (status) for different objects (id) with its timestamp:
ID | Date | Time | Status
-------------------------------
7 | 2016-10-10 | 8:23 | Passed
7 | 2016-10-10 | 8:29 | Failed
7 | 2016-10-13 | 5:23 | Passed
8 | 2016-10-09 | 5:43 | Passed
I want to get a result table using plain SQL (MS SQL) like this:
ID | Date | Status
------------------------
7 | 2016-10-10 | Failed
7 | 2016-10-13 | Passed
8 | 2016-10-09 | Passed
where the "status" is the latest entry on a day, given that at least one event for this object has been recorded.
My current solution is using "Outer Apply" and "TOP(1)" like this:
SELECT DISTINCT rn.id,
tmp.date,
tmp.status
FROM run rn OUTER apply
(SELECT rn2.date, tmp2.status AS 'status'
FROM run rn2 OUTER apply
(SELECT top(1) rn3.id, rn3.date, rn3.time, rn3.status
FROM run rn3
WHERE rn3.id = rn.id
AND rn3.date = rn2.date
ORDER BY rn3.id ASC, rn3.date + rn3.time DESC) tmp2
WHERE tmp2.status <> '' ) tmp
As far as I understand this outer apply command works like:
For every id
For every recorded day for this id
Select the newest status for this day and this id
But I'm facing performance issues, therefore I think that this solution is not adequate. Any suggestions how to solve this problem or how to optimize the sql?

Your code seems too complicated. Why not just do this?
SELECT r.id, r.date, r2.status
FROM run r OUTER APPLY
(SELECT TOP 1 r2.*
FROM run r2
WHERE r2.id = r.id AND r2.date = r.date AND r2.status <> ''
ORDER BY r2.time DESC
) r2;
For performance, I would suggest an index on run(id, date, status, time).

Using a CTE will probably be the fastest:
with cte as
(
select ID, Date, Status, row_number() over (partition by ID, Date order by Time desc) rn
from run
)
select ID, Date, Status
from cte
where rn = 1

Do not SELECT from a log table, instead, write a trigger that updates a latest_run table like:
CREATE TRIGGER tr_run_insert ON run FOR INSERT AS
BEGIN
UPDATE latest_run SET Status=INSERTED.Status WHERE ID=INSERTED.ID AND Date=INSERTED.Date
IF ##ROWCOUNT = 0
INSERT INTO latest_run (ID,Date,Status) SELECT (ID,Date,Status) FROM INSERTED
END
Then perform reads from the much shorter lastest_run table.
This will add a performance penalty on writes because you'll need two writes instead of one. But will give you much more stable response times on read. And if you do not need to SELECT from "run" table you can avoid indexing it, therefore the performance penalty of two writes is partly compensated by less indexes maintenance.

Self-referencing Query and Not Equals

Trying to pull data from a single table called tblTooling where two TlPartNo numbers are equal to different values and the TlToolNo are not equal for these TlPartNo . This is an Access DB and the following statement gets me close, but still gives too much data.
SELECT DISTINCT
tblTooling.TlToolNo,
tblTooling.TlPartNo,
tblTooling.TlOP,
tblTooling.TlQuantity
FROM tblTooling, tblTooling AS tblTooling_1
WHERE (((tblTooling.TlToolNo)<>tblTooling_1.TlToolNo)
AND ((tblTooling.TlPartNo)="10290722")
AND ((tblTooling_1.TlPartNo)="10295379"));
The included image has the tblTooling structure and Data. Plus the expected results from the query.

You seem to want exclude a ToolNo value when it occurs with both PartNo values. In that case you could group intermediate results by ToolNo, and see whether in such a group there is only one PartNo present (with having). In that case keep that record, and in the outer query, get the two other columns added to it:
SELECT DISTINCT
tblTooling.TlToolNo,
tblTooling.TlPartNo,
tblTooling.TlOP,
tblTooling.TlQuantity
FROM tblTooling
INNER JOIN (
SELECT TlToolNo,
Min(TlPartNo) AS MinTlPartNo,
Max(TlPartNo) AS MaxTlPartNo
FROM tblTooling
WHERE TlPartNo IN ("10290722", "10295379")
GROUP BY TlToolNo
HAVING Min(TlPartNo) = Max(TlPartNo)
) AS grp
ON grp.TlToolNo = tblTooling.TlToolNo
AND grp.MinTlPartNo = tblTooling.TlPartNo
Note that for your sample data this will return 4 rows:
TlToolNo | TlPartNo | TlOP | TlQuantity
----------+----------+------+-----------
T00012362 | 10290722 | OP10 | 2
T00012456 | 10290722 | OP10 | 1
T00013456 | 10290722 | OP20 | 1
T00014348 | 10295379 | OP20 | 1

I think you can do this with not exists:
select t.*
from tblTooling as t
where not exists (select 1
from tblTooling as t2
where t2.TlPartNo in ("10290722", "10295379") and
t2.TlToolNo = t.TlToolNo and
t2.tiid <> t.tiid
) and
t.TlPartNo in ("10290722", "10295379");
This saves on the select distinct, which should be a performance boost.

SQL - Computing overlap between Interests

I have a schema (millions of records with proper indexes in place) that looks like this:
groups | interests
------ | ---------
user_id | user_id
group_id | interest_id
A user can like 0..many interests and belong to 0..many groups.
Problem: Given a group ID, I want to get all the interests for all the users that do not belong to that group, and, that share at least one interest with anyone that belongs to the same provided group.
Since the above might be confusing, here's a straightforward example (SQLFiddle):
| 1 | 2 | 3 | 4 | 5 | (User IDs)
|-------------------|
| A | | A | | |
| B | B | B | | B |
| | C | | | |
| | | D | D | |
In the above example users are labeled with numbers while interests have characters.
If we assume that users 1 and 2 belong to group -1, then users 3 and 5 would be interesting:
user_id interest_id
------- -----------
3 A
3 B
3 D
5 B
I already wrote a dumb and very inefficient query that correctly returns the above:
SELECT * FROM "interests" WHERE "user_id" IN (
SELECT "user_id" FROM "interests" WHERE "interest_id" IN (
SELECT "interest_id" FROM "interests" WHERE "user_id" IN (
SELECT "user_id" FROM "groups" WHERE "group_id" = -1
)
) AND "user_id" NOT IN (
SELECT "user_id" FROM "groups" WHERE "group_id" = -1
)
);
But all my attempts to translate that into a proper joined query revealed themselves fruitless: either the query returns way more rows than it should or it just takes 10x as long as the sub-query, like:
SELECT "iii"."user_id" FROM "interests" AS "iii"
WHERE EXISTS
(
SELECT "ii"."user_id", "ii"."interest_id" FROM "groups" AS "gg"
INNER JOIN "interests" AS "ii" ON "gg"."user_id" = "ii"."user_id"
WHERE EXISTS
(
SELECT "i"."interest_id" FROM "groups" AS "g"
INNER JOIN "interests" AS "i" ON "g"."user_id" = "i"."user_id"
WHERE "group_id" = -1 AND "i"."interest_id" = "ii"."interest_id"
) AND "group_id" != -1 AND "ii"."user_id" = "iii"."user_id"
);
I've been struggling trying to optimize this query for the past two nights...
Any help or insight that gets me in the right direction would be greatly appreciated. :)
PS: Ideally, one query that returns an aggregated count of common interests would be even nicer:
user_id totalInterests commonInterests
------- -------------- ---------------
3 3 1/2 (either is fine, but 2 is better)
5 1 1
However, I'm not sure how much slower it would be compared to doing it in code.

Using the following to set up test tables
--drop table Interests ----------------------------
CREATE TABLE Interests
(
InterestId char(1) not null
,UserId int not null
)
INSERT Interests values
('A',1)
,('A',3)
,('B',1)
,('B',2)
,('B',3)
,('B',5)
,('C',2)
,('D',3)
,('D',4)
-- drop table Groups ---------------------
CREATE TABLE Groups
(
GroupId int not null
,UserId int not null
)
INSERT Groups values
(-1, 1)
,(-1, 2)
SELECT * from Groups
SELECT * from Groups
The following query would appear to do what you want:
DECLARE #GroupId int
SET #GroupId = -1
;WITH cteGroupInterests (InterestId)
as (-- List of the interests referenced by the target group
select distinct InterestId
from Groups gr
inner join Interests nt
on nt.UserId = gr.UserId
where gr.GroupId = #GroupId)
-- Aggregate interests for each user
SELECT
UserId
,count(OwnInterstId) OwnInterests
,count(SharedInterestId) SharedInterests
from (-- Subquery lists all interests for each user
select
nt.UserId
,nt.InterestId OwnInterstId
,cte.InterestId SharedInterestId
from Interests nt
left outer join cteGroupInterests cte
on cte.InterestId = nt.InterestId
where not exists (-- Correlated subquery: is "this" user in the target group?)
select 1
from Groups gr
where gr.GroupId = #GroupId
and gr.UserId = nt.UserId)) xx
group by UserId
having count(SharedInterestId) > 0
It appears to work, but I'd want to do more elaborate tests, and I've no idea how well it'd work against millions of rows. Key points are:
cte creates a temp table referenced by the later query; building an actual temp table might be a performance boost
Correlated subqueries can be tricky, but indexes and not exists should make this pretty quick
I was lazy and left out all the underscores, sorry

This is a bit confounding. I think the best approach is exists and not exists:
select i.*
from interest i
where not exists (select 1
from groups g
where i.user_id = g.user_id and
g.group_id = $group_id
) and
exists (select 1
from groups g join
interest i2
on g.user_id = i2.user_id
where g.user_id <> i.user_user_id and
i.interest_id = i2.interest_id
);
The first subquery is saying that the user is not in the group. The second is saying that the interest is shared with someone who is in the group.

Normalising a table in Oracle SQL

Was wondering if someone could assist in providing some guidance as to how I could most efficiently normalize the following table so that I can create a refreshable view / table.
Table1:
SYSTEM_KEY | ID | ORDER | ORDER_STATUS | SYSTEM_Actions
A 1 Pencil Open Shipped
B 1 Pencil Open Tested
C 1 Pencil Open Shipped
A 1 Paper Closed Delivered
I'am looking to normalize this table in a repeatable way to something like this:
RESULT:
ID | ORDER | Order Status | A_actions | B_Actions | C_Actions
1 Pencil OPEN Shipped Tested Delivered
1 Paper Closed Delivered null null
I was able to achieve this by doing something similar to this
Select full.ID, full.order, full.orderstatus, case when system_ID = 'A' then sysa.system_actions as A_actions, ....{for B, C}
from table1 full
left join table1 sysa on full.id = sysa.id and full.order = sysa.order
left join table1 sysb on full.id = sysb.id and full.order = sysb.order
Whilst this appeared to work, it was quite clunky in terms of being repeatable having to use several staging tables.
Does anyone know if a good way I can achieve this?

try using group by clause
select id,order,order_status,
case when system_ID = 'A' then system_actions as A_actions,
case when system_ID = 'B' then system_actions as B_actions,
case when system_ID = 'C' then system_actions as C_actions
from table1
group by id,order,order_status

Convert SQL to LINQ for same table query

I've been trying to write a linq query but the groupby performance is horrifically slow, so I wrote my query in SQL instead and it's really speady but I can't get linq pad to convert it to linq for me. Can any body help me convert this sql to Linq please:
(SELECT mm.rcount, * FROM
(SELECT m.TourID AS myId, COUNT(m.RecordType) AS rcount FROM
(
((SELECT *
FROM Bookings h
WHERE h.RecordType = 'H' AND h.TourArea like '%bull%')
union
(SELECT *
FROM Bookings t
WHERE t.RecordType = 'T' and t.TourGuideName like '%bull%'))
) m
group by m.TourID) mm
INNER JOIN Bookings b ON mm.myId= b.TourID
WHERE b.RecordType = 'H');
here's my LINQ effort but it takes like 20 seconds to iterate over 200 records:
var heads = from head in db.GetTable<BookingType>()
where head.RecordType == "H" &&
head.TourArea.Contains("bull")
select g;
var tgs = from tourguides in db.GetTable<BookingType>()
where tourguides.RecordType == "T" &&
tourguides.TourGuideName.Contains("bull")
select tourguides;
var all = heads.Union(tgs);
var groupedshit = from r in all
group r by r.BookingID into g
select g;
return heads;
Edit 1:
Here's my database structure:
BookingID [PK] | TourID | RecordType | TourArea | TourGuideName | ALoadOfOtherFields
And here's some sample data:
1 | 1 | H | Bullring | null
2 | 1 | T | null | Bulldog
3 | 2 | H | Bullring | null
4 | 2 | T | null | Bulldog
5 | 2 | T | null | bull stamp
There will only ever be a single H (head) record but could potentially have many T (tour guide) records. After the grouping if I select a new (like this question: How to use LINQ to SQL to create ranked search results?) on the .Contains('bull') with a .Count() I can then get ranked searching (which is the whole point of this exercise).
Edit 2:
I've added in a property for search rank in the class itself to avoid the problem of then converting my results into a key/value pair. I don't know if this is best practice but it works.
/// <summary>
/// Search Ranking
/// </summary>
public int? SearchRank { get; set; }
and then I execute a SQL query directly using linq-to-sql:
IEnumerable<BookingType> results = db.ExecuteQuery<BookingType>
("(SELECT mm.rcount AS SearchRank, b.* FROM (SELECT m.TourID AS myId, COUNT(m.RecordType) AS rcount FROM (((SELECT * FROM Bookings h WHERE h.RecordType = 'H' AND h.TourArea like '%{0}%') union (SELECT * FROM Bookings t WHERE t.RecordType = 'T' and t.TourGuideName like '%{0}%')) ) m group by m.TourID) mm INNER JOIN Bookings b ON mm.myId= b.TourID WHERE b.RecordType = 'H')", "bull");
I can add in as many 'AND's and 'OR's as I like now without Linq-to-sql going mental (the query it generated was a crazy 200 lines long!
Ranked Search viola!

You don't have to use union at all. you can use Where OR AND something like this should work:
var result= from b in DB.GetTable<Booking>()
where (b.recordType =="H" || b.recordType=="T")
&&b.TourArea.Contains("bull")
group b by b.Booking_Id into g
select g;

Why bother converting it? You can just call the SQl you have opptimized.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQLite query three tables - sql

Related

Getting the latest entry per day / SQL Optimizing

Self-referencing Query and Not Equals

SQL - Computing overlap between Interests

Normalising a table in Oracle SQL

Convert SQL to LINQ for same table query

Categories

Resources