Convert SQL to LINQ for same table query - sql

I've been trying to write a LINQ query, but the GroupBy performance is horrifically slow, so I wrote my query in SQL instead and it's really speedy, but I can't get LINQPad to convert it to LINQ for me. Can anybody help me convert this SQL to LINQ, please:
(SELECT mm.rcount, * FROM
(SELECT m.TourID AS myId, COUNT(m.RecordType) AS rcount FROM
(
((SELECT *
FROM Bookings h
WHERE h.RecordType = 'H' AND h.TourArea like '%bull%')
union
(SELECT *
FROM Bookings t
WHERE t.RecordType = 'T' and t.TourGuideName like '%bull%'))
) m
group by m.TourID) mm
INNER JOIN Bookings b ON mm.myId= b.TourID
WHERE b.RecordType = 'H');
Here's my LINQ effort, but it takes about 20 seconds to iterate over 200 records:
var heads = from head in db.GetTable<BookingType>()
where head.RecordType == "H" &&
head.TourArea.Contains("bull")
select head;
var tgs = from tourguides in db.GetTable<BookingType>()
where tourguides.RecordType == "T" &&
tourguides.TourGuideName.Contains("bull")
select tourguides;
var all = heads.Union(tgs);
var groupedshit = from r in all
group r by r.BookingID into g
select g;
return heads;
Edit 1:
Here's my database structure:
BookingID [PK] | TourID | RecordType | TourArea | TourGuideName | ALoadOfOtherFields
And here's some sample data:
1 | 1 | H | Bullring | null
2 | 1 | T | null | Bulldog
3 | 2 | H | Bullring | null
4 | 2 | T | null | Bulldog
5 | 2 | T | null | bull stamp
There will only ever be a single H (head) record per tour, but there could be many T (tour guide) records. After the grouping, if I project into a new object (as in this question: How to use LINQ to SQL to create ranked search results?) with a .Count() of the .Contains("bull") matches, I can then get ranked searching (which is the whole point of this exercise).
Edit 2:
I've added in a property for search rank in the class itself to avoid the problem of then converting my results into a key/value pair. I don't know if this is best practice but it works.
/// <summary>
/// Search Ranking
/// </summary>
public int? SearchRank { get; set; }
and then I execute a SQL query directly using linq-to-sql:
IEnumerable<BookingType> results = db.ExecuteQuery<BookingType>
("(SELECT mm.rcount AS SearchRank, b.* FROM (SELECT m.TourID AS myId, COUNT(m.RecordType) AS rcount FROM (((SELECT * FROM Bookings h WHERE h.RecordType = 'H' AND h.TourArea like '%{0}%') union (SELECT * FROM Bookings t WHERE t.RecordType = 'T' and t.TourGuideName like '%{0}%')) ) m group by m.TourID) mm INNER JOIN Bookings b ON mm.myId= b.TourID WHERE b.RecordType = 'H')", "bull");
I can add in as many 'AND's and 'OR's as I like now without Linq-to-sql going mental (the query it generated was a crazy 200 lines long!).
Ranked search, voilà!
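For anyone who wants to try the ranked query outside of LINQ to SQL, here is a minimal sketch that runs the same SQL against the sample rows from Edit 1. It assumes Python with the built-in sqlite3 module in place of SQL Server, passes the search pattern as a bound parameter, and uses a hypothetical helper name (ranked_search). SQLite's LIKE is case-insensitive for ASCII by default, which may differ from your SQL Server collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Bookings (
           BookingID INTEGER PRIMARY KEY, TourID INTEGER, RecordType TEXT,
           TourArea TEXT, TourGuideName TEXT)"""
)
# Sample rows from the question (the other fields are left out).
conn.executemany(
    "INSERT INTO Bookings VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "H", "Bullring", None),
        (2, 1, "T", None, "Bulldog"),
        (3, 2, "H", "Bullring", None),
        (4, 2, "T", None, "Bulldog"),
        (5, 2, "T", None, "bull stamp"),
    ],
)

RANKED_SEARCH = """
SELECT b.BookingID, b.TourID, mm.rcount AS SearchRank
FROM (SELECT m.TourID AS myId, COUNT(m.RecordType) AS rcount
      FROM (SELECT * FROM Bookings h
            WHERE h.RecordType = 'H' AND h.TourArea LIKE :pat
            UNION
            SELECT * FROM Bookings t
            WHERE t.RecordType = 'T' AND t.TourGuideName LIKE :pat) m
      GROUP BY m.TourID) mm
INNER JOIN Bookings b ON mm.myId = b.TourID
WHERE b.RecordType = 'H'
ORDER BY mm.rcount DESC, b.BookingID
"""


def ranked_search(connection, term):
    """Return (BookingID, TourID, SearchRank) for head records, best match first."""
    return connection.execute(RANKED_SEARCH, {"pat": f"%{term}%"}).fetchall()


ranked = ranked_search(conn, "bull")
print(ranked)
```

Tour 2 ranks first because three of its records match (one head plus two tour guides), while tour 1 has two matching records.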

You don't have to use a union at all. You can combine the conditions with OR and AND in a single where clause; something like this should work:
var result = from b in db.GetTable<BookingType>()
where (b.RecordType == "H" && b.TourArea.Contains("bull"))
|| (b.RecordType == "T" && b.TourGuideName.Contains("bull"))
group b by b.TourID into g
select g;

Why bother converting it? You can just call the SQL you have optimized.

Recursive query using postgreSQL

My database contains data (images, for example), and this data can be modified by a program (image processing, for example), so I get a new image derived from the original, and this image could be modified as well, etc.
Two images could also be used to create a new one, for example: image a + image b = image c
So in my database I have a table called "derived_from" which contains 2 columns (id_previous, id_new): id_previous is the image before the image processing and id_new is the result. So I can have a "change history" like this:
+-------------+--------+
| id_previous | id_new |
+-------------+--------+
| a           | c      |
| b           | c      |
| c           | d      |
| d           | e      |
+-------------+--------+
So my question is:
Is it possible to write a recursive query that returns the whole history of a data ID?
Something like this:
Select * from derived_from where id_new = 'e'
Should return (d,c,b,a)
Thank you for your help
Yes, you can achieve this with a recursive CTE:
with recursive r as (
select id_previous
from derived_from
where id_new = 'e'
union
select d.id_previous
from derived_from d
join r on id_new = r.id_previous
)
select id_previous
from r
http://rextester.com/NZKT73800
Notes:
UNION can stop the recursion even when you have loops. With UNION ALL, you have to handle loops yourself, unless you are really sure you have none.
This will give you separate rows (one for each "ascendant"). You can aggregate them too, but separate rows are typically much easier to consume than comma-separated lists or arrays.
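To make the note about loops concrete, here is a small sketch of the same recursive CTE run against the sample table. It assumes Python's built-in sqlite3 module instead of PostgreSQL (SQLite also supports WITH RECURSIVE), and the helper name ancestors and the extra looping row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE derived_from (id_previous TEXT, id_new TEXT)")
conn.executemany(
    "INSERT INTO derived_from VALUES (?, ?)",
    [("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")],
)

ANCESTORS = """
WITH RECURSIVE r(id_previous) AS (
    SELECT id_previous FROM derived_from WHERE id_new = ?
    UNION  -- UNION drops rows already seen, so a loop cannot recurse forever
    SELECT d.id_previous
    FROM derived_from d
    JOIN r ON d.id_new = r.id_previous
)
SELECT id_previous FROM r
"""


def ancestors(connection, image_id):
    """Return every id the given id was derived from, sorted alphabetically."""
    return sorted(row[0] for row in connection.execute(ANCESTORS, (image_id,)))


history = ancestors(conn, "e")
print(history)

# Hypothetical extra row that closes a loop: 'a' is now derived from 'e'.
conn.execute("INSERT INTO derived_from VALUES ('e', 'a')")
history_with_loop = ancestors(conn, "e")
print(history_with_loop)
```

With the loop in place the query still terminates; 'e' simply shows up as one of its own ancestors.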
You can use a recursive CTE:
with recursive cte as (
select df.id_new, df.id_previous as parent
from derived_from df
where df.id_new = 'e'
union all
select cte.id_new, df.id_previous
from cte join
derived_from df
on cte.parent = df.id_new
)
select id_new, array_agg(parent)
from cte
group by id_new;

SQL - Computing overlap between Interests

I have a schema (millions of records with proper indexes in place) that looks like this:
groups | interests
------ | ---------
user_id | user_id
group_id | interest_id
A user can like 0..many interests and belong to 0..many groups.
Problem: Given a group ID, I want to get all the interests for all the users that do not belong to that group and that share at least one interest with anyone who belongs to that group.
Since the above might be confusing, here's a straightforward example (SQLFiddle):
| 1 | 2 | 3 | 4 | 5 | (User IDs)
|-------------------|
| A | | A | | |
| B | B | B | | B |
| | C | | | |
| | | D | D | |
In the above example users are labeled with numbers while interests have characters.
If we assume that users 1 and 2 belong to group -1, then users 3 and 5 would be interesting:
user_id interest_id
------- -----------
3 A
3 B
3 D
5 B
I already wrote a dumb and very inefficient query that correctly returns the above:
SELECT * FROM "interests" WHERE "user_id" IN (
SELECT "user_id" FROM "interests" WHERE "interest_id" IN (
SELECT "interest_id" FROM "interests" WHERE "user_id" IN (
SELECT "user_id" FROM "groups" WHERE "group_id" = -1
)
) AND "user_id" NOT IN (
SELECT "user_id" FROM "groups" WHERE "group_id" = -1
)
);
But all my attempts to translate that into a proper joined query proved fruitless: either the query returns way more rows than it should or it just takes 10x as long as the sub-query, like:
SELECT "iii"."user_id" FROM "interests" AS "iii"
WHERE EXISTS
(
SELECT "ii"."user_id", "ii"."interest_id" FROM "groups" AS "gg"
INNER JOIN "interests" AS "ii" ON "gg"."user_id" = "ii"."user_id"
WHERE EXISTS
(
SELECT "i"."interest_id" FROM "groups" AS "g"
INNER JOIN "interests" AS "i" ON "g"."user_id" = "i"."user_id"
WHERE "group_id" = -1 AND "i"."interest_id" = "ii"."interest_id"
) AND "group_id" != -1 AND "ii"."user_id" = "iii"."user_id"
);
I've been struggling trying to optimize this query for the past two nights...
Any help or insight that gets me in the right direction would be greatly appreciated. :)
PS: Ideally, one query that returns an aggregated count of common interests would be even nicer:
user_id totalInterests commonInterests
------- -------------- ---------------
3 3 1/2 (either is fine, but 2 is better)
5 1 1
However, I'm not sure how much slower it would be compared to doing it in code.
Using the following to set up test tables
--drop table Interests ----------------------------
CREATE TABLE Interests
(
InterestId char(1) not null
,UserId int not null
)
INSERT Interests values
('A',1)
,('A',3)
,('B',1)
,('B',2)
,('B',3)
,('B',5)
,('C',2)
,('D',3)
,('D',4)
-- drop table Groups ---------------------
CREATE TABLE Groups
(
GroupId int not null
,UserId int not null
)
INSERT Groups values
(-1, 1)
,(-1, 2)
SELECT * from Interests
SELECT * from Groups
The following query would appear to do what you want:
DECLARE @GroupId int
SET @GroupId = -1
;WITH cteGroupInterests (InterestId)
as (-- List of the interests referenced by the target group
select distinct InterestId
from Groups gr
inner join Interests nt
on nt.UserId = gr.UserId
where gr.GroupId = @GroupId)
-- Aggregate interests for each user
SELECT
UserId
,count(OwnInterestId) OwnInterests
,count(SharedInterestId) SharedInterests
from (-- Subquery lists all interests for each user
select
nt.UserId
,nt.InterestId OwnInterestId
,cte.InterestId SharedInterestId
from Interests nt
left outer join cteGroupInterests cte
on cte.InterestId = nt.InterestId
where not exists (-- Correlated subquery: is "this" user in the target group?
select 1
from Groups gr
where gr.GroupId = @GroupId
and gr.UserId = nt.UserId)) xx
group by UserId
having count(SharedInterestId) > 0
It appears to work, but I'd want to do more elaborate tests, and I've no idea how well it'd work against millions of rows. Key points are:
The CTE acts as a named subquery referenced by the later query (SQL Server does not materialize it); building an actual temp table might be a performance boost
Correlated subqueries can be tricky, but indexes and not exists should make this pretty quick
I was lazy and left out all the underscores, sorry
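As a quick sanity check of the CTE approach, here is a sketch that runs an equivalent query against the test data above. It assumes Python's built-in sqlite3 module rather than SQL Server, so the T-SQL variable becomes a bound parameter (:gid) and the Groups table name is quoted; the helper name shared_interest_counts is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Interests (InterestId TEXT NOT NULL, UserId INTEGER NOT NULL);
    CREATE TABLE "Groups" (GroupId INTEGER NOT NULL, UserId INTEGER NOT NULL);
    INSERT INTO Interests VALUES
        ('A',1),('A',3),('B',1),('B',2),('B',3),('B',5),('C',2),('D',3),('D',4);
    INSERT INTO "Groups" VALUES (-1, 1), (-1, 2);
    """
)

QUERY = """
WITH cteGroupInterests (InterestId) AS (
    -- Interests held by members of the target group
    SELECT DISTINCT nt.InterestId
    FROM "Groups" gr
    JOIN Interests nt ON nt.UserId = gr.UserId
    WHERE gr.GroupId = :gid
)
SELECT xx.UserId,
       COUNT(xx.OwnInterestId)    AS OwnInterests,
       COUNT(xx.SharedInterestId) AS SharedInterests  -- COUNT skips NULLs
FROM (
    SELECT nt.UserId,
           nt.InterestId  AS OwnInterestId,
           cte.InterestId AS SharedInterestId
    FROM Interests nt
    LEFT JOIN cteGroupInterests cte ON cte.InterestId = nt.InterestId
    -- Keep only users who are not members of the target group
    WHERE NOT EXISTS (SELECT 1 FROM "Groups" gr
                      WHERE gr.GroupId = :gid AND gr.UserId = nt.UserId)
) xx
GROUP BY xx.UserId
HAVING COUNT(xx.SharedInterestId) > 0
ORDER BY xx.UserId
"""


def shared_interest_counts(connection, group_id):
    """Return (UserId, OwnInterests, SharedInterests) for users outside the group."""
    return connection.execute(QUERY, {"gid": group_id}).fetchall()


counts = shared_interest_counts(conn, -1)
print(counts)
```

User 4 drops out because interest D is not held by anyone in the group, which matches the expected result in the question. This says nothing about how the query behaves on millions of rows.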
This is a bit confounding. I think the best approach is exists and not exists:
select i.*
from interests i
where not exists (select 1
from groups g
where i.user_id = g.user_id and
g.group_id = $group_id
) and
exists (select 1
from groups g join
interests i2
on g.user_id = i2.user_id
where g.group_id = $group_id and
i.interest_id = i2.interest_id
);
The first subquery is saying that the user is not in the group. The second is saying that the interest is shared with someone who is in the group.

Query to fetch records from 2 diff table into 2 columns

I have 2 tables like below:
1)
Engine
======
ID Title Unit Value
123 Hello Inch 50
555 Hii feet 60
2)
Fuel
=====
ID Title Value
123 test12 343
555 test5556 777
I want the select result in 2 columns as per the ID given (ID should be same in both tables) :
Title -- This will get the (Title + Unit) from the Engine table and only the Title from the Fuel table.
Value -- This will get the Value from both tables.
Result for ID = 123 is :
Title Value
Hello(Inch) 50
test12 343
Any suggestions on how I can get this in SQL Server 2008?
Based on your sample data and the desired result, it appears that you want to use a UNION ALL to get the data from both tables:
select title+'('+Unit+')' Title, value
from engine
where id = 123
union all
select title, value
from fuel
where id = 123
See SQL Fiddle with Demo
The result of the query is:
| TITLE | VALUE |
-----------------------
| Hello(Inch) | 50 |
| test12 | 343 |
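If you want to try the UNION ALL approach without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. That is an assumption on my part: SQLite concatenates strings with || where T-SQL uses +, and the helper name title_and_value is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Engine (ID INTEGER, Title TEXT, Unit TEXT, Value INTEGER);
    CREATE TABLE Fuel (ID INTEGER, Title TEXT, Value INTEGER);
    INSERT INTO Engine VALUES (123, 'Hello', 'Inch', 50), (555, 'Hii', 'feet', 60);
    INSERT INTO Fuel VALUES (123, 'test12', 343), (555, 'test5556', 777);
    """
)

QUERY = """
SELECT Title || '(' || Unit || ')' AS Title, Value
FROM Engine
WHERE ID = :id
UNION ALL
SELECT Title, Value
FROM Fuel
WHERE ID = :id
"""


def title_and_value(connection, record_id):
    """Return the (Title, Value) rows from both tables for one ID."""
    return connection.execute(QUERY, {"id": record_id}).fetchall()


rows_123 = title_and_value(conn, 123)
for title, value in rows_123:
    print(title, value)
```

UNION ALL keeps every row from both selects, so it also works when the Engine and Fuel rows happen to have identical values.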
Look at SQL JOINs: INNER JOIN, LEFT JOIN etc
Select
e.ID, e.Title, e.Unit, e.Value, f.Title as FuelTitle, f.Value as FuelValue,
e.Title+' '+e.Unit As TitleAndUnits
From Engine e
Join Fuel f
On e.ID = f.ID
You can do this w/o a join but with join it may be more optimal depending on other factors in your case.
Example w/o join:
select concat(t1.c1, ' ', t1.c2, ' ', t2.c1) col1, concat(t1.c3, ' ', t2.c3) col2
from t1, t2
where t1.id = [ID] and t2.id = [ID]
You should probably have a look at something like Introduction to JOINs – Basic of JOINs and read up a little on JOINS
Join Fundamentals
SQL Server Join Example
SQL Joins
EDIT
Maybe then also look at
CASE (Transact-SQL)
+ (String Concatenation) (Transact-SQL)
CAST and CONVERT (Transact-SQL)

I'm trying to convert a SQL query to linq, but I'm getting stuck on combining group by and max()

What I'm trying to do is find the last ID key column for each SerialNumber column within a table.
Here's the sample tables, truncated to just the columns I need to answer this question.
Devices Table
SerialNumber
12345
45678
67890
History Table
ID | SerialNumber
1 | 12345
2 | 45678
3 | 67890
4 | 12345
5 | 67890
My expected output would be:
ID | SerialNumber
2 | 45678
4 | 12345
5 | 67890
Here's the SQL I'm trying to convert from:
SELECT max(ID) as ID, SerialNumber
FROM History
WHERE SerialNumber in
(SELECT Distinct SerialNumber
FROM Devices
WHERE DeviceType = 0)
GROUP BY SerialNumber
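To pin down what the LINQ version has to return, here is a sketch that runs this SQL against the sample rows using Python's built-in sqlite3 module. Two things are assumptions, since the question does not show them: every sample device has DeviceType = 0, and the ORDER BY plus the LastID alias are added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Devices (SerialNumber TEXT, DeviceType INTEGER);
    CREATE TABLE History (ID INTEGER PRIMARY KEY, SerialNumber TEXT);
    -- Assumption: all sample devices have DeviceType = 0.
    INSERT INTO Devices VALUES ('12345', 0), ('45678', 0), ('67890', 0);
    INSERT INTO History VALUES
        (1, '12345'), (2, '45678'), (3, '67890'), (4, '12345'), (5, '67890');
    """
)

QUERY = """
SELECT MAX(ID) AS LastID, SerialNumber
FROM History
WHERE SerialNumber IN (SELECT DISTINCT SerialNumber
                       FROM Devices
                       WHERE DeviceType = 0)
GROUP BY SerialNumber
ORDER BY LastID
"""

# One row per serial number, carrying the highest History ID seen for it.
last_ids = conn.execute(QUERY).fetchall()
print(last_ids)
```

Each group collapses to a single row, which is the behaviour a group ... into followed by .Max() on the group needs to reproduce in LINQ.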
Here's the best working linq query I've got:
var query= from ListOfIDs in History
where (from distinctSerialNubers in Devices
where distinctSerialNubers.DeviceType == 0
select distinctSerialNubers.SerialNumber).Contains(ListOfIDs.SerialNumber)
select new {ListOfIDs.ID, ListOfIDs.SerialNumber};
Using .Max() will return the last ID in the History table. Trying to group ListOfIDs by ListOfIDs.SerialNumber will separate all ids by SerialNumber, but I can't seem to find a way to get into the groups to run .Max().
Edit:
Thank you for the answers. This combination of the two answers works well for me.
var query= from historyEntry in History
where (from s in Devices where s.DeviceType==0 select s.SerialNumber).Contains(historyEntry.SerialNumber)
group historyEntry by historyEntry.SerialNumber into historyGroup
select historyGroup.Max(entry => entry.ID);
Thank you both, I've been trying to figure this one out all day.
Haven't compiled/tested it but this should be what you're looking for:
from historyEntry in db.History
join device in db.Devices on historyEntry.SerialNumber equals device.SerialNumber
where device.DeviceType == 0
group historyEntry by historyEntry.SerialNumber into historyGroup
select new { ID = historyGroup.Max(entry => entry.ID), SerialNumber = historyGroup.Key }
What about :
var snoQuery = (from s in Devices where s.DeviceType==0 select s.SerialNumber).Distinct();
var query = (
from h in History
where snoQuery.Contains(h.SerialNumber)
group h by h.SerialNumber into results
select new { Id = results.Max(r => r.ID), SerialNumber = results.Key }
);

SQLite query three tables

I'm working on an Android app that will display some information from a sqlite database in a listview. I need some help sorting out my query.
The database looks like this:
[monitors] 1 --- <has> --- * [results] 1 --- <has> --- * [steps]
Table monitors has columns: _id | warning_threshold | alarm_threshold | monitor_name
Table results has columns: _id | monitor_id | timestamp | test_info
Table steps has columns: _id | result_id | correct | response_time
I'm trying to make a query that would return:
1) All rows & columns from the monitors table.
2) The newest test_info for each monitor from the results table.
3) Count the number of correct = true for each result from the steps table.
The returned cursor should look something like this:
_id | monitor_name | warning_threshold | alarm_threshold | test_info | correct_count
1 | 'hugo' | 1000 | 1500 | 'some info' | 7
2 | 'kurt' | 800 | 1200 | 'info.....' | 5
My query:
SELECT * FROM
(SELECT monitors._id AS _id,
monitors.monitor_name AS monitor_name,
monitors.warning_threshold AS warning_threshold,
monitors.alarm_threshold AS alarm_threshold,
results.test_info AS test_info
FROM monitors
LEFT JOIN results
ON monitors._id = results.monitor_id
ORDER BY results.timestamp ASC) AS inner
GROUP BY inner._id;
I almost got it working. I am able to get the info from monitors and results, I still need to get the correct_count. Any help with sorting out this query would be greatly appreciated.
This is my approach, using a combination of Left Joins, sub queries, and correlated subqueries:
SELECT monitors._id AS _id,
monitors.monitor_name AS monitor_name,
monitors.warning_threshold AS warning_threshold,
monitors.alarm_threshold AS alarm_threshold,
LastResults.test_info AS test_info,
COUNT(CorrectSteps._id) AS correct_count
FROM monitors
LEFT JOIN
(SELECT * FROM results as r1 where timestamp =
(SELECT Max(r2.timestamp) FROM results AS r2 WHERE r1.monitor_id=r2.monitor_id)) LastResults
ON monitors._id = LastResults.monitor_id
LEFT JOIN
(SELECT * FROM steps WHERE correct = 'true') CorrectSteps
ON LastResults._id = CorrectSteps.result_id
GROUP BY monitors._id;
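Here is a small sketch for trying this query outside of Android, using Python's built-in sqlite3 module. The rows in results and steps are invented for illustration (the question only shows the expected cursor), correct is assumed to be stored as the text 'true'/'false', and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE monitors (_id INTEGER PRIMARY KEY, warning_threshold INTEGER,
                           alarm_threshold INTEGER, monitor_name TEXT);
    CREATE TABLE results (_id INTEGER PRIMARY KEY, monitor_id INTEGER,
                          timestamp INTEGER, test_info TEXT);
    CREATE TABLE steps (_id INTEGER PRIMARY KEY, result_id INTEGER,
                        correct TEXT, response_time INTEGER);
    INSERT INTO monitors VALUES (1, 1000, 1500, 'hugo'), (2, 800, 1200, 'kurt');
    -- Hypothetical data: monitor 1 has an old and a new result, monitor 2 has one.
    INSERT INTO results VALUES
        (1, 1, 100, 'old info'), (2, 1, 200, 'some info'), (3, 2, 150, 'info.....');
    INSERT INTO steps VALUES
        (1, 1, 'true', 10), (2, 2, 'true', 12), (3, 2, 'true', 11),
        (4, 2, 'false', 50), (5, 3, 'true', 9);
    """
)

QUERY = """
SELECT monitors._id, monitors.monitor_name,
       monitors.warning_threshold, monitors.alarm_threshold,
       LastResults.test_info,
       COUNT(CorrectSteps._id) AS correct_count
FROM monitors
LEFT JOIN (SELECT * FROM results AS r1
           WHERE timestamp = (SELECT MAX(r2.timestamp) FROM results AS r2
                              WHERE r1.monitor_id = r2.monitor_id)) LastResults
       ON monitors._id = LastResults.monitor_id
LEFT JOIN (SELECT * FROM steps WHERE correct = 'true') CorrectSteps
       ON LastResults._id = CorrectSteps.result_id
GROUP BY monitors._id
ORDER BY monitors._id
"""

# One row per monitor: newest test_info plus the correct steps of that result.
monitor_rows = conn.execute(QUERY).fetchall()
for row in monitor_rows:
    print(row)
```

Only the steps of the newest result are counted, so the three rows belonging to result 2 contribute two correct steps for 'hugo' and the older result is ignored.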
Something like this should work. I haven't been able to test it out but hopefully it will at least get you started. Note that this query is not even close to optimized. Wrote it quickly during my lunch :)
SELECT m._id,
m.monitor_name,
m.warning_threshold,
m.alarm_threshold,
(SELECT r.test_info
FROM results r
WHERE r.monitor_id = m._id
ORDER BY r.timestamp DESC
LIMIT 1) as 'test_info',
(SELECT COUNT(_id)
FROM steps s
WHERE s.result_id IN (SELECT _id FROM results WHERE monitor_id = m._id)
AND s.correct = 'true') as 'correct_count'
FROM monitors m