SQL left or right join, one-to-many pagination

I have one main table and join other tables to it via left outer or right outer joins. One row of the main table produces more than 30 rows in the join query result. I am trying to paginate, but the problem is that I cannot know how many rows the query will return for a single main-table row.
Example :
In my query, the first main-table row produces 40 result rows.
The second main-table row produces 120 rows.
Problem(Question) UPDATE:
For pagination I need to pass a page size, which counts rows of the select result. But I cannot know the right row count for my select. For example, if I pass page number 1 and page size 50, I do not get the right result. I need the page size that matches the top 10 rows of my main table. For the top 10 main-table rows the result might be 200 rows, but my page size is 50; that is the problem.
I am using SQL Server 2014. I need this for my ASP.NET project, but that is not important.
Sample UPDATE :
It is like searching for a hotel to book. The main table is the hotel table. The other things are images (media table), videos (media table), locations (place table) and maybe comments (comment table); each has more than one row per hotel, in a one-to-many relationship with the hotel. For one hotel the result could be 100, 50 or 10 rows for all this info. I am trying to paginate the hotel results, and I always need to get 20, 30 or 50 hotels per page for performance in my project.
Sample Query UPDATE :
SELECT
*
FROM
KisiselCoach KC
JOIN WorkPlace WP
ON KC.KisiselCoachId = WP.WorkPlaceOwnerId
JOIN Album A
ON KC.KisiselCoachId = A.AlbumId
JOIN Media M
ON A.AlbumId = M.AlbumId
LEFT JOIN Rating R
ON KC.KisiselCoachId = R.OylananId
JOIN FrUser Fr
ON KC.CoachId = Fr.UserId
JOIN UserJob UJ
ON KC.KisiselCoachId = UJ.UserJobOwnerId
JOIN Job J
ON UJ.JobId = J.JobId
JOIN UserExpertise UserEx
ON KC.KisiselCoachId = UserEx.UserExpertiseOwnerId
JOIN Expertise Ex
ON UserEx.ExpertiseId = Ex.ExpertiseId
Hotel Table :
HotelId   HotelName
1         Barcelona
2         Berlin
Media Table :
MediaID   MediaUrl      HotelId
1         www.xxx.com   1
2         www.xxx.com   1
3         www.xxx.com   1
4         www.xxx.com   1
Location Table :
LocationId   Adress           HotelId
1            xyz, Berlin      1
2            xyz, Nice        1
3            xyz, Sevilla     1
4            xyz, Barcelona   1
Comment Table :
CommentId   Comment            HotelId
1           you are cool       1
2           you are great      1
3           you are bad        1
4           hmm you are okey   1
This is only a sample! I have 9999999 hotels in my database. A hotel may have 100 images or none; I cannot know this. I need to get 20 hotels per page, but 20 hotels may mean 1000 rows or 100 rows.

First, your query is hard to read: its layout does not show the flow and relationships of the tables. I have updated and indented it to try to show how the tables relate hierarchically.
You also want to paginate; let's get back to that. Do you intend to show every record as a possible item, or a "parent"-level set of data, e.g. only one instance per media, per user, or whatever, so that once an entry is selected you show the details for that one entity? If so, I would query DISTINCT at the top level, or at least grab the few columns along with a count(*) of the child records to show at the next level.
Also, mixing inner, left and right joins can be confusing. Typically a right join means you want all the records from the right-hand table of the join. Could this be rewritten so that all required tables come first and the optional ones are LEFT JOINed to them?
Clarifying all these relationships would definitely help, along with the context of what you are trying to get out of the pagination. I'll check for comments, but if the details are lengthy, please edit them into your original question rather than posting a long comment.
Here is my SOMEWHAT clarified query, rewritten to what I THINK the relationships are within your database. Notice the indentation showing where table A -> B -> C -> D for readability. Apart from the LEFT JOIN to Rating, these are all (INNER) JOINs, meaning every respective table must have a match. If some things are NOT always there, change those to LEFT JOINs.
SELECT
      *
   FROM
      KisiselCoach KC
         JOIN WorkPlace WP
            ON KC.KisiselCoachId = WP.WorkPlaceOwnerId
         JOIN Album A
            ON KC.KisiselCoachId = A.AlbumId
            JOIN Media M
               ON A.AlbumId = M.AlbumId
         LEFT JOIN Rating R
            ON KC.KisiselCoachId = R.OylananId
         JOIN FrUser Fr
            ON KC.CoachId = Fr.UserId
         JOIN UserJob UJ
            ON KC.KisiselCoachId = UJ.UserJobOwnerId
            JOIN Job J
               ON UJ.JobId = J.JobId
         JOIN UserExpertise UserEx
            ON KC.KisiselCoachId = UserEx.UserExpertiseOwnerId
            JOIN Expertise Ex
               ON UserEx.ExpertiseId = Ex.ExpertiseId
Readability of a query is a BIG help for yourself and for anyone assisting or following you. When the ON clauses are not next to their corresponding joins, the query can be very confusing to follow.
Also, which is your PRIMARY table, with the rest being lookup/reference tables?
ADDITION PER COMMENT
OK, so I rewrote a query that appears to have no connection to the sample data and what you want in your post. That said, I would start with a list of hotels only, plus a count(*) of things per hotel, so you can give SOME indication of how much detail each one has. Something like:
select
      H.HotelID,
      H.HotelName,
      coalesce( MedSum.recs, 0 ) as MediaItems,
      coalesce( LocSum.recs, 0 ) as NumberOfLocations,
      coalesce( ComSum.recs, 0 ) as NumberOfComments
   from
      Hotel H
         LEFT JOIN
         ( select M.HotelID,
                  count(*) recs
              from Media M
              group by M.HotelID ) MedSum
            on H.HotelID = MedSum.HotelID
         LEFT JOIN
         ( select L.HotelID,
                  count(*) recs
              from Location L
              group by L.HotelID ) LocSum
            on H.HotelID = LocSum.HotelID
         LEFT JOIN
         ( select C.HotelID,
                  count(*) recs
              from Comment C
              group by C.HotelID ) ComSum
            on H.HotelID = ComSum.HotelID
   order by
      H.HotelName
-- apply any limit per pagination
This returns every hotel at the top level, with the total count of each kind of thing per hotel. Any of those may or may not exist, hence each sub-query is LEFT JOINed. Show a page of 20 different hotels. Then, as soon as a person picks a single hotel, you can drill into the locations, media and comments for that one hotel.
Although this COULD work, running these counts on every query might get very time-consuming. You might want to add counter columns to your main hotel table holding the counts computed here. Then run the full count ONCE to prime them across all history, and have a nightly process update the counts only for those hotels with new activity since the previous run. You are not going to have 1,000,000 new images, locations and comments in a day; if there are 22,000, then the hotels they belong to are the only records you would re-update counts for. Each incremental cycle would be short because it only covers the newest entries. For the web, having pre-aggregated counts, sums, etc. is a big time saver where practical.
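For the paging itself, one option is to page the Hotel table on its own and only then join the child rows, so that the page size counts hotels rather than joined rows. The following is a minimal sketch, assuming SQL Server 2012 or later (for OFFSET ... FETCH) and the sample Hotel and Media tables from the question; @PageNo and @PageSize are hypothetical parameters.

```sql
-- Hypothetical paging parameters (not from the original post)
DECLARE @PageNo   int = 1;
DECLARE @PageSize int = 20;

WITH HotelPage AS (
    -- Page the parent table only: exactly one row per hotel
    SELECT H.HotelId, H.HotelName
    FROM Hotel H
    ORDER BY H.HotelName, H.HotelId      -- HotelId breaks ties so pages are stable
    OFFSET (@PageNo - 1) * @PageSize ROWS
    FETCH NEXT @PageSize ROWS ONLY
)
-- Only now fan out to the child rows of the hotels on this page
SELECT HP.HotelId, HP.HotelName, M.MediaID, M.MediaUrl
FROM HotelPage HP
LEFT JOIN Media M
    ON M.HotelId = HP.HotelId
ORDER BY HP.HotelName, HP.HotelId, M.MediaID;
```

Each page then covers exactly @PageSize hotels (fewer on the last page), however many media rows they have. Joining Media, Location and Comment in the same statement would multiply their rows against each other, so it is probably simpler to run one such query per child table.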

Related

Having SQL Server choose and show one record over other

Ok, hopefully I can explain this accurately. I work in SQL Server, and I am trying to get one row from a table that will show multiple rows for the same person for various reasons.
There is a column called college_attend which will show either New or Cont for each student.
My issue: my initial query narrows down the rows I'm pulling by Academic Year, which consists of two semesters: Fall of one year, and Spring of the following to create an academic year. This is why there are two rows returned for some students.
Basically, I need to generate an accurate count of those that are "New" and those that are "Cont", but I don't want both records for the same student counted. They will have two records because they will have one for spring and one for fall (usually). So if a student is "New" in fall, they will have a "Cont" record for spring. I want the query to show ONLY the "New" record if they have both a "New" and a "Cont" record, and count it (which I will do in Report Builder). The other students will basically have two records that are "Cont", one for fall and one for spring, and so those would be considered the continuing ones or "Cont".
Here is the basic query I have so far:
SELECT DISTINCT
    people.people_id,
    people.last_name,
    people.first_name,
    academic.college_attend AS NewORCont,
    academic.academic_year,
    academic.academic_term
FROM
    academic
INNER JOIN
    people ON people.people_id = academic.people_id
INNER JOIN
    academiccalendar acc ON acc.academic_year = academic.academic_year
                        AND acc.academic_term = academic.academic_term
                        AND acc.true_academic_year = @Academic_year
I'm not sure if this can be done with a CASE statement? I thought of a GROUP BY, but then SQL Server will want me to add all of my columns to the GROUP BY clause, and that ends up negating the purpose of the grouping in the first place.
Just a sample of what I work with for each student:
People ID   Last     First   NeworCont
---------   ------   -----   ---------
12345       Soanso   Guy     New
12345       Soanso   Guy     Cont
32345       Person   Nancy   Cont
32345       Person   Nancy   Cont
55555       Smith    John    New
55555       Smith    John    Cont
Hopefully this sheds some light on the duplicate record issue I mentioned.
Without sample data it's awkward to visualize the problem, and without the expected results it's also unclear what outcome you want. Perhaps this will help: it limits the results to only those students who have both 'New' and 'Cont' records in a single "true academic year". The count seems redundant, though, as (I imagine) it will always be 2: one 'New' term and one 'Cont' term.
SELECT
      people.people_id
    , people.last_name
    , people.first_name
    , acc.true_academic_year
    , count(*) AS count_of
FROM academic
INNER JOIN people ON people.people_id = academic.people_id
INNER JOIN academiccalendar acc ON acc.academic_year = academic.academic_year
                               AND acc.academic_term = academic.academic_term
                               AND acc.true_academic_year = @Academic_year
GROUP BY
      people.people_id
    , people.last_name
    , people.first_name
    , acc.true_academic_year
HAVING MAX(academic.college_attend) = 'New'
   AND MIN(academic.college_attend) = 'Cont'
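If the goal is instead one row per student, labelled "New" whenever any of that student's records in the academic year is "New" and "Cont" otherwise, a conditional aggregate may be closer to what was asked. This is only a sketch: it assumes college_attend holds nothing but 'New' and 'Cont', and that @Academic_year is the report parameter.

```sql
SELECT
    people.people_id,
    people.last_name,
    people.first_name,
    -- 'New' wins if the student has at least one 'New' record in the year
    CASE
        WHEN SUM(CASE WHEN academic.college_attend = 'New' THEN 1 ELSE 0 END) > 0
            THEN 'New'
        ELSE 'Cont'
    END AS NewORCont
FROM academic
INNER JOIN people
    ON people.people_id = academic.people_id
INNER JOIN academiccalendar acc
    ON acc.academic_year = academic.academic_year
   AND acc.academic_term = academic.academic_term
   AND acc.true_academic_year = @Academic_year
GROUP BY
    people.people_id,
    people.last_name,
    people.first_name;
```

Because academic_year and academic_term are left out of the SELECT and GROUP BY lists, the fall and spring records collapse into a single row per student, which can then be counted by NewORCont in the report.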

How to make a query to obtain only results that have N number within a range of values?

I'm trying to extract nutrient data in MS Access 2007 from the USDA food database, freely available at http://www.ars.usda.gov/Services/docs.htm?docid=24912
I need records that have ALL nutrients with NUT_DATA.Nutr_No values between '501' and '511'. I wish to exclude incomplete records that are missing any of those values.
Currently, Baby food banana has all from nutrient 501 to 511, but Baby food Beverage has only 9 of the nutrients listed, and many others are like that.
As a last resort, I guess it would be acceptable to have all records, showing null for missing values, as long as each FOOD_DES.Long_Desc has exactly 11 records, one for each NUT_DATA.Nutr_No OR NUTR_DEF.NutrDesc (which correspond to each other).
SELECT
FOOD_DES.NDB_No, FOOD_DES.FdGrp_Cd, FOOD_DES.Long_Desc, NUT_DATA.Nutr_No, NUTR_DEF.NutrDesc, NUT_DATA.Nutr_Val, WEIGHT.Amount, WEIGHT.Msre_Desc, WEIGHT.Gm_Wgt, [WEIGHT]![Amount] & " " & [WEIGHT]![Msre_Desc] AS msre
FROM
NUTR_DEF inner JOIN ((FOOD_DES INNER JOIN NUT_DATA ON FOOD_DES.NDB_No=NUT_DATA.NDB_No) INNER JOIN WEIGHT ON FOOD_DES.NDB_No=WEIGHT.NDB_No) ON NUTR_DEF.Nutr_No=NUT_DATA.Nutr_No
WHERE
(NUT_DATA.Nutr_No between '501' and '511' ) and ((WEIGHT.Seq)="1") and NUT_DATA.Nutr_Val > '0' and
// this part is me out of ideas trying stuff, but didn't help
EXISTS (SELECT 1
FROM
NUTR_DEF inner JOIN ((FOOD_DES INNER JOIN NUT_DATA ON FOOD_DES.NDB_No=NUT_DATA.NDB_No) INNER JOIN WEIGHT ON FOOD_DES.NDB_No=WEIGHT.NDB_No) ON NUTR_DEF.Nutr_No=NUT_DATA.Nutr_No
WHERE count FOOD_DES.Long_Desc = "11" )
//end wild of experimentation
ORDER BY FOOD_DES.Long_Desc, NUTR_DEF.SR_Order;
This is a sample of the data. I just copied the most important columns. The red is not what I'm looking for because it doesn't have all 11 nutrients. I can paste on the google doc the whole table if someone thinks that would help.
https://docs.google.com/spreadsheets/d/1FghDD59wy2PYlpsqUlYVc3Ulwvy4MMLagpBUYtvLBfI/edit?usp=sharing
As your starting point, identify which food items have values > 0 for all 11 of those nutrients. Check whether this simpler GROUP BY query shows you the correct items:
SELECT ndat.NDB_No
FROM
    NUT_DATA AS ndat
    INNER JOIN WEIGHT AS wt
    ON ndat.NDB_No = wt.NDB_No
WHERE
    ndat.Nutr_Val>0
    AND ndat.Nutr_No IN('501','502','503','504','505','506','507','508','509','510','511')
    AND wt.Seq='1'
GROUP BY ndat.NDB_No
HAVING Count(ndat.Nutr_No)=11;
Note you could use Val(ndat.Nutr_No) Between 501 And 511 as the Nutr_No restriction, which would give you a more concise statement. However, evaluating Val() for every row of the table means that approach would forego the performance benefit of indexed retrieval ... so that version of the query should be noticeably slower.
Save that query and create a new query which joins it to the base tables for the additional data you need from other columns. Or use it as a subquery instead of a named query if you prefer.
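As a rough sketch of the subquery variant in Access SQL, using only column names that appear in the question (the aliases fd, nd and complete_items are made up for illustration, and this has not been run against the USDA database):

```sql
SELECT fd.NDB_No, fd.Long_Desc, nd.Nutr_No, nd.Nutr_Val
FROM (FOOD_DES AS fd
    INNER JOIN NUT_DATA AS nd
    ON fd.NDB_No = nd.NDB_No)
    INNER JOIN (
        -- The GROUP BY query from above: items with all 11 nutrients > 0
        SELECT ndat.NDB_No
        FROM NUT_DATA AS ndat
            INNER JOIN WEIGHT AS wt
            ON ndat.NDB_No = wt.NDB_No
        WHERE ndat.Nutr_Val > 0
            AND ndat.Nutr_No IN ('501','502','503','504','505','506','507','508','509','510','511')
            AND wt.Seq = '1'
        GROUP BY ndat.NDB_No
        HAVING Count(ndat.Nutr_No) = 11
    ) AS complete_items
    ON fd.NDB_No = complete_items.NDB_No
WHERE nd.Nutr_No IN ('501','502','503','504','505','506','507','508','509','510','511')
ORDER BY fd.Long_Desc, nd.Nutr_No;
```

The inner query decides which foods qualify; the outer query then pulls the 11 nutrient rows for just those foods. The WEIGHT and NUTR_DEF columns from the original query can be joined to the outer query in the same way.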

Join Queries with parameter filter on second query SSMS

I have two tables: EQUIPMENT and WORKORDERS.
EQUIPMENT returns the count of Equipment against a particular depot by the type of Equipment:
MAINTDEPOT   EQUIPCOUNT   EQUIPTYPE
1            44           MC
2            20           MC
3            5            MC
1            20           FS
2            3            FS
3            10           FS
...and so on. These counts rarely change unless a new bit of kit is put in.
I need to join a count of WORKORDERS to this table, but the work orders have a COSTCENTRE of either A B or E. This is so that I can generate a percentage of equipment with workorders.
I've joined the tables, but when I add a parameter filter to the WORKORDERS COSTCENTRE column the Count of EQUIPMENT changes, and I need it to stay the same.
I'm guessing I need to use subqueries to ensure that the left subquery remains static whilst the filter only changes the right hand one. Does anyone have any idea how I do this?
Here's my current query:
SELECT E.E_MAINTDEPOT, E.E_EQUIPCOUNT, C.Category, E.MYORDER, E.W_WORKCOUNT,
       E.E_NOWO, E.W_HRS, E.E_QA, E.E_EGI, E.E_CLASS,
       ISNULL(ROUND(CAST(E.E_NOWO AS Float) /
                    CAST(E.E_EQUIPCOUNT AS Float) * 100, 2), 100) AS RESULT,
       SUBSTRING(E.E_CLASS, 1, 1) AS EM_CLASS
FROM (
    SELECT T.E_MAINTDEPOT, COUNT(T.EQUIP_NO) AS E_EQUIPCOUNT,
           SUM(C.W_WORKCOUNT) AS W_WORKCOUNT,
           COUNT(T.EQUIP_NO) - SUM(C.W_WORKCOUNT) AS E_NOWO, T.MYORDER,
           T.E_QA, T.E_EGI, T.E_CLASS, SUM(C.W_HRS) AS W_HRS
    FROM EQDType AS T
    FULL OUTER JOIN EquipWOCount AS C
        ON T.EQUIP_NO = C.EQUIP_NO
    GROUP BY T.MYORDER, T.E_MAINTDEPOT, T.E_QA, T.E_EGI, T.E_CLASS, C.W_FUNCTION
) AS E
INNER JOIN EQDCategory AS C
    ON E.MYORDER = C.Myorder
ORDER BY E.MYORDER, E.E_MAINTDEPOT
Thank you

Why is this SQL query returning repeated records, when they're not repeated in the database?

SELECT *
FROM support_systems,tickets
INNER JOIN user_access ON tickets.support_system_id = user_access.support_system_id
WHERE support_systems.account_id = #session.account_id#
AND user_access.user_access_level >= 1
AND user_access.user_id = #session.user_id#
Any clue why this query would return a record set with repeated records? The results are looking like this:
Priority   ID   Subject         Status
high       1    First Subject   open
high       1    First Subject   open
low        3    Weeee           open
low        3    Weeee           open
medium     4    hhhhh           closed
medium     4    hhhhh           closed
medium     5    neat            open
medium     5    neat            open
Let me know if you guys need more information, thanks a lot.
You are selecting records from the table support_systems but have not specified a join condition for it. What is the relationship between this table and the others you are querying?
You may want something like this
SELECT *
FROM support_systems
INNER JOIN tickets ON
    support_systems.support_system_id = tickets.support_system_id
INNER JOIN user_access ON
    tickets.support_system_id = user_access.support_system_id
WHERE support_systems.account_id = #session.account_id#
    AND user_access.user_access_level >= 1
    AND user_access.user_id = #session.user_id#
The problem is this line:
FROM support_systems,tickets
I would remove tickets from the FROM clause and make it an INNER JOIN clause. Right now you have what's called a Cartesian product (a cross join): http://en.wikipedia.org/wiki/Cartesian_product
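A quick way to check this, as a sketch reusing the table and the ColdFusion variable from the question: because nothing links support_systems to the other two tables, each ticket row is repeated once for every support_systems row that passes the WHERE clause, so counting those rows should give the repetition factor.

```sql
-- How many support_systems rows get paired with every ticket row?
SELECT COUNT(*) AS systems_for_account
FROM support_systems
WHERE support_systems.account_id = #session.account_id#
```

If this returns 2, that would be consistent with every ticket appearing twice in the output shown in the question.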
I would say it's probably because you have one explicit join and one implicit (comma) join that isn't constrained in the WHERE clause, which produces a Cartesian product.
You have three tables,
but only two of them are used in the join. You need a second join: support_systems has to be included in your joins somewhere.
Probably something like:
from support_systems a left join user_access b on a.support_system_id = b.support_system_id
left join tickets c on c.support_system_id = b.support_system_id
Then your WHERE would be the same (apart from using the aliases), and it would return rows based on correctly joined tables.

SQL: Need to remove duplicate rows in query containing multiple joins

Note that I'm a complete SQL noob and in the process of learning. Based on Google searches (including searching here) I've tried using SELECT DISTINCT and GROUP BY but neither works, likely due to all of my joins (if anyone knows why they won't work exactly, that would be helpful to learn).
I need data from a variety of tables and below is the only way I know to do it (I just know the basics). The query below works fine but shows duplicates. I need to know how to remove those. The only hint I have right now is perhaps a nested SELECT query but based on research I'm not sure how to implement them. Any help at all would be great, thanks!
USE SQL_Contest
go
SELECT
    CLT.Description AS ClockType,
    CLK.SerialNumber AS JobClockSerial,
    SIT.SiteNumber AS JobID,
    SIT.[Name] AS JobsiteName,
    SIT.Status AS SiteActivityStatus,
    DHA.IssuedDate AS DHAIssuedDate, -- Date the clock was assigned to THAT jobsite
    CLK.CreatedDate AS CLKCreatedDate, -- Date clock first was assigned to ANY jobsite
    SES.ClockVoltage
FROM tb_Clock CLK
INNER JOIN tb_ClockType CLT
    ON CLK.TypeID = CLT.ClockTypeID
INNER JOIN tb_DeviceHolderActivity DHA
    ON CLK.ClockGUID = DHA.DeviceGUID
INNER JOIN tb_Site SIT
    ON SIT.SiteGUID = DHA.HolderGUID
LEFT JOIN tb_Session SES
    ON SES.ClockSerialNumber = CLK.SerialNumber
WHERE DHA.ReturnedDate IS NULL
ORDER BY SIT.[Name] ASC
EDIT: I will be reviewing these answers shortly, thank you very much. I'm posting the additional duplicate info per Rob's request:
Everything displays fine until I add:
LEFT JOIN tb_Session SES
ON SES.ClockSerialNumber = CLK.SerialNumber
Which I need. That's when a duplicate appears:
JobClock 2,500248E4,08-107,Brentwood Job,1,2007-05-04 13:36:54.000,2007-05-04 13:47:55.407,3049
JobClock 2,500248E4,08-107,Brentwood Job,1,2007-05-04 13:36:54.000,2007-05-04 13:47:55.407,3049
I want that info to only display once. Essentially this query is to determine all active jobsites that have a clock assigned to them, and that job only has one clock assigned to it, and it's only one jobsite, but it's appearing twice.
EDIT 2: Based on the help you guys provided I was able to determine they actually are NOT duplicates, and each session is independent, that is the only one that happened to have two sessions. So now I'm going to try to figure out how to only pull in information from the latest session.
If everything "works fine" until you add:
LEFT JOIN tb_Session SES
ON SES.ClockSerialNumber = CLK.SerialNumber
Then there must be more than one record in tb_Session for each CLK.SerialNumber.
Run the following query:
SELECT *
FROM tb_Session SES
WHERE ClockSerialNumber = '500248E4'
There should be two records returned. You need to decide how to handle this (i.e. Which record do you want to use?), unless both rows from tb_Session contain identical data, in which case, should they?
You could always change your query to:
SELECT
    CLT.Description AS ClockType,
    CLK.SerialNumber AS JobClockSerial,
    SIT.SiteNumber AS JobID,
    SIT.[Name] AS JobsiteName,
    SIT.Status AS SiteActivityStatus,
    DHA.IssuedDate AS DHAIssuedDate, -- Date the clock was assigned to THAT jobsite
    CLK.CreatedDate AS CLKCreatedDate, -- Date clock first was assigned to ANY jobsite
    SES.ClockVoltage
FROM tb_Clock CLK
INNER JOIN tb_ClockType CLT
    ON CLK.TypeID = CLT.ClockTypeID
INNER JOIN tb_DeviceHolderActivity DHA
    ON CLK.ClockGUID = DHA.DeviceGUID
INNER JOIN tb_Site SIT
    ON SIT.SiteGUID = DHA.HolderGUID
LEFT JOIN
    (
        SELECT DISTINCT ClockSerialNumber, ClockVoltage
        FROM tb_Session
    ) SES
    ON SES.ClockSerialNumber = CLK.SerialNumber
WHERE DHA.ReturnedDate IS NULL
ORDER BY SIT.[Name] ASC
That should ensure that SES contains only one record for each unique combination of ClockSerialNumber and ClockVoltage. If two sessions for the same clock report different voltages, you will still get two rows.
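For the follow-up in EDIT 2 (using only the latest session per clock), one possible approach on SQL Server 2005 or later is OUTER APPLY with TOP 1. This is a sketch only: the question does not show the columns of tb_Session, so SessionDate is a hypothetical timestamp column standing in for whatever actually orders the sessions.

```sql
SELECT
    CLK.SerialNumber AS JobClockSerial,
    SES.ClockVoltage
FROM tb_Clock CLK
OUTER APPLY
    (
        -- At most one row per clock: its most recent session
        SELECT TOP 1 S.ClockVoltage
        FROM tb_Session S
        WHERE S.ClockSerialNumber = CLK.SerialNumber
        ORDER BY S.SessionDate DESC   -- SessionDate is an assumed column name
    ) SES
```

OUTER APPLY keeps clocks that have no session at all (ClockVoltage comes back NULL), which matches the behaviour of the original LEFT JOIN. The other joins and the WHERE clause from the full query can be added around it unchanged.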
Take this example dataset:
Ingredient
IngredientId   IngredientName
============   ==============
1              Apple
2              Orange
3              Pear
4              Tomato
Recipe
RecipeId   RecipeName
========   ==========
1          Apple Turnover
2          Apple Pie
3          Poached Pears
Recipe_Ingredient
RecipeId   IngredientId   Quantity
========   ============   ========
1          1              0.25
1          1              1.00
2          1              2.00
3          3              1.00
Note: Why the Apple Turnover has two lots of apple as ingredients is neither here nor there; it just does.
The following query will return two rows for the "Apple Turnover" recipe, one row for the "Apple Pie" recipe and one row for the "Poached Pears" recipe, because Recipe_Ingredient has two entries pairing RecipeId 1 with IngredientId 1. That's just what happens with a join.
SELECT I.IngredientName,
R.RecipeName
FROM Ingredient I
JOIN Recipe_Ingredient RI
ON I.IngredientId = RI.IngredientId
JOIN Recipe R
ON RI.recipeId = R.RecipeId
You could get this to return only one row per recipe by changing it to:
SELECT I.IngredientName,
R.RecipeName
FROM Ingredient I
JOIN Recipe_Ingredient RI
ON I.IngredientId = RI.IngredientId
JOIN Recipe R
ON RI.recipeId = R.RecipeId
GROUP BY I.IngredientName, R.RecipeName
Without more specifics about your data it's hard to apply this to your scenario, but the walkthrough may help you, as someone unfamiliar with SQL, understand where the "duplicates" are coming from.
The joins are not your problem. From your comments I infer that what you are calling "duplicates" are not actual duplicates. If all column values of two "duplicates" returned from the query matched, then either SELECT DISTINCT or GROUP BY would definitely eliminate them. So you should be able to find a solution by looking at your column definitions.
My best guess is that you're getting rows for the same date which aren't really duplicates because the time component of the date doesn't match. To eliminate this problem, you can truncate the date fields to the date only, using this technique:
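The walkthrough above can be reproduced with a small script. This is a sketch for SQL Server that loads the example dataset into temporary tables (the # prefixes and the column types are choices made here, not part of the original example) and runs both queries.

```sql
CREATE TABLE #Ingredient (IngredientId int, IngredientName varchar(50));
CREATE TABLE #Recipe (RecipeId int, RecipeName varchar(50));
CREATE TABLE #Recipe_Ingredient (RecipeId int, IngredientId int, Quantity decimal(5,2));

INSERT INTO #Ingredient VALUES (1, 'Apple');
INSERT INTO #Ingredient VALUES (2, 'Orange');
INSERT INTO #Ingredient VALUES (3, 'Pear');
INSERT INTO #Ingredient VALUES (4, 'Tomato');

INSERT INTO #Recipe VALUES (1, 'Apple Turnover');
INSERT INTO #Recipe VALUES (2, 'Apple Pie');
INSERT INTO #Recipe VALUES (3, 'Poached Pears');

INSERT INTO #Recipe_Ingredient VALUES (1, 1, 0.25);
INSERT INTO #Recipe_Ingredient VALUES (1, 1, 1.00);
INSERT INTO #Recipe_Ingredient VALUES (2, 1, 2.00);
INSERT INTO #Recipe_Ingredient VALUES (3, 3, 1.00);

-- Plain join: 4 rows, with Apple / Apple Turnover appearing twice
SELECT I.IngredientName, R.RecipeName
FROM #Ingredient I
JOIN #Recipe_Ingredient RI ON I.IngredientId = RI.IngredientId
JOIN #Recipe R ON RI.RecipeId = R.RecipeId;

-- Grouped: 3 rows, one per ingredient/recipe pair
SELECT I.IngredientName, R.RecipeName
FROM #Ingredient I
JOIN #Recipe_Ingredient RI ON I.IngredientId = RI.IngredientId
JOIN #Recipe R ON RI.RecipeId = R.RecipeId
GROUP BY I.IngredientName, R.RecipeName;

DROP TABLE #Recipe_Ingredient;
DROP TABLE #Recipe;
DROP TABLE #Ingredient;
```

Adding RI.Quantity to the select list and the GROUP BY brings the fourth row back, because 0.25 and 1.00 differ; this is the same effect as rows that look like duplicates but differ in one column.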
DATEADD(DAY, DATEDIFF(DAY, 0, DHA.IssuedDate), 0) AS DHAIssuedDate,
DATEADD(DAY, DATEDIFF(DAY, 0, CLK.CreatedDate), 0) AS CLKCreatedDate,
If that doesn't work you might want to take a look at JobClockSerial: does this column belong in the query results?