Issue with joins in a SQL query - sql

SELECT
c.ConfigurationID AS RealflowID, c.companyname,
c.companyphone, c.ContactEmail, COUNT(k.caseid)
FROM
dbo.Configuration c
INNER JOIN
dbo.cases k ON k.SiteID = c.ConfigurationId
WHERE
EXISTS (SELECT * FROM dbo.RepairEstimates
WHERE caseid = k.caseid)
AND c.AccountStatus = 'Active'
AND c.domainid = 46
GROUP BY
c.configurationid,c.companyname, c.companyphone, c.ContactEmail
I am using the Configuration table to look up the SiteID of each case in the cases table. If a case exists in the RepairEstimates table, I want to pull the listed company details along with a count of how many of that site's cases appear in RepairEstimates.
I hope that description is clear enough.
The problem is that the count does not match the data being pulled. Is there something I could do differently? A different join? Remove the EXISTS and add another join? I have tried many different things and am not sure what else to try.

Realized I was using the wrong table. The query was correct.

Related

Determine datatypes of columns - SQL selection

Is it possible to determine the data type of each column after a SQL selection, based on the received results? I know it is possible through information_schema.columns, but the data I receive comes from multiple tables that are joined together, and the columns are renamed. Besides that, I'm not able to see or use this query or execute other queries myself.
My job is to store this received data in another table, but without knowing beforehand what I will receive. I can obviously check whether a certain column contains numbers or text, but not whether it was originally stored as a TINYINT(1) or a BIGINT(128). How should I approach this? To clarify, it is all right if the data types of the source and destination columns aren't entirely the same, but I don't want to reserve too much space beforehand (or too little, for that matter).
As I'm typing, I realize I'm formulating the question wrong. What would be the best approach to handle the described situation? I thought about altering tables on the fly (e.g. increasing size if needed), but that seems a bit, well, wrong and not the proper way.
Thanks
Can you issue the following query about your new table after you create it?
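-- Note: SELECT * INTO fails if both tables share a column name (such as ID);
-- list the columns explicitly in that case.
SELECT *
INTO JoinedQueryResults
FROM TableA AS A
INNER JOIN TableB AS B ON A.ID = B.ID;
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'JoinedQueryResults';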
Is the query too big to run before knowing how big the results will be? If so, get an idea of how many rows it may return first. The trick with queries that contain joins is to group on the columns you are joining on, which lets the estimate return more quickly. Here is an example that returns just the row count of the query that would have created the JoinedQueryResults table above.
SELECT SUM(A.NumRows * B.NumRows)
FROM (SELECT ID, COUNT(*) AS NumRows
FROM TableA
GROUP BY ID) AS A
INNER JOIN (SELECT ID, COUNT(*) AS NumRows
FROM TableB
GROUP BY ID) AS B ON A.ID = B.ID
The query above will run faster if all you need is a record count to help you estimate a size.
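The per-key count trick above can be checked on a small dataset. What follows is a minimal sketch using Python's standard-library sqlite3 module instead of SQL Server. The sample rows and the payload column are made up for illustration; only TableA, TableB and ID come from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER, payload TEXT);
    CREATE TABLE TableB (ID INTEGER, payload TEXT);
    INSERT INTO TableA VALUES (1, 'a1'), (1, 'a2'), (2, 'a3'), (3, 'a4');
    INSERT INTO TableB VALUES (1, 'b1'), (1, 'b2'), (1, 'b3'), (2, 'b4'), (4, 'b5');
""")

# Row count obtained by actually performing the join.
actual_rows = conn.execute("""
    SELECT COUNT(*)
    FROM TableA AS A
    INNER JOIN TableB AS B ON A.ID = B.ID
""").fetchone()[0]

# Row count estimated from per-ID counts, as in the answer above.
estimated_rows = conn.execute("""
    SELECT SUM(A.NumRows * B.NumRows)
    FROM (SELECT ID, COUNT(*) AS NumRows FROM TableA GROUP BY ID) AS A
    INNER JOIN (SELECT ID, COUNT(*) AS NumRows FROM TableB GROUP BY ID) AS B
        ON A.ID = B.ID
""").fetchone()[0]

print(actual_rows, estimated_rows)
```

Both numbers agree because each ID contributes (rows in TableA) times (rows in TableB) to the join.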
Also try instantiating a table for your results with a query like this.
SELECT TOP 0 *
INTO JoinedQueryResults
FROM TableA AS A
INNER JOIN TableB AS B ON A.ID = B.ID

Specifying SELECT, then joining with another table

I just hit a wall with my SQL query fetching data from my MS SQL Server.
To simplify, say I have one table for sales and one table for customers. Each has a corresponding userId which I can use to join the tables.
I wish to first SELECT from the sales table where, say, price is equal to 10, and then join on the userId in order to get access to the name, address, etc. from the customer table.
In which order should I structure the query? Do I need some sort of subquery, or what do I do?
I have tried something like this
SELECT *
FROM Sales
WHERE price = 10
INNER JOIN Customers
ON Sales.userId = Customers.userId;
Needless to say this is very simplified and not my database schema, yet it explains my problem simply.
Any suggestions? I am at a loss here.
A SELECT statement has a fixed order for its components.
In its simple form this is:
What do I select: the column list
From where: the table name and joined tables
Are there filters: WHERE
How to sort: ORDER BY
So most likely it is enough to change your statement to
SELECT *
FROM Sales
INNER JOIN Customers ON Sales.userId = Customers.userId
WHERE price = 10;
The WHERE clause must follow the joins:
SELECT * FROM Sales
INNER JOIN Customers
ON Sales.userId = Customers.userId
WHERE price = 10
This is simply the way SQL syntax works. You seem to be trying to put the clauses in the order that you think they should be applied, but SQL is a declarative language, not a procedural one: you are defining what you want to occur, not how it will be done.
You could also write the same thing like this:
SELECT * FROM (
SELECT * FROM Sales WHERE price = 10
) AS filteredSales
INNER JOIN Customers
ON filteredSales.userId = Customers.userId
This may seem to indicate a different order of operations, but it is logically identical to the first query. In either case, the database engine may choose to perform the join and the filtering in either order, as long as the result is identical.
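The claim that both forms return the same rows can be tried out on toy data. This is a rough sketch using Python's standard-library sqlite3 module rather than MS SQL Server. The saleId and name columns and all sample rows are assumptions added for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (saleId INTEGER, userId INTEGER, price INTEGER);
    CREATE TABLE Customers (userId INTEGER, name TEXT);
    INSERT INTO Sales VALUES (1, 100, 10), (2, 100, 25), (3, 200, 10), (4, 300, 10);
    INSERT INTO Customers VALUES (100, 'Ann'), (200, 'Bob');
""")

# Form 1: join first in the text, filter in the WHERE clause.
where_after_join = conn.execute("""
    SELECT s.saleId, c.name
    FROM Sales s
    INNER JOIN Customers c ON s.userId = c.userId
    WHERE s.price = 10
    ORDER BY s.saleId
""").fetchall()

# Form 2: filter inside a derived table, then join.
filtered_subquery = conn.execute("""
    SELECT f.saleId, c.name
    FROM (SELECT * FROM Sales WHERE price = 10) AS f
    INNER JOIN Customers c ON f.userId = c.userId
    ORDER BY f.saleId
""").fetchall()

print(where_after_join)
print(filtered_subquery)
```

Sale 4 is dropped by both forms because userId 300 has no matching customer and the join is an inner join.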
Sounds fine to me, did you run the query and check?
SELECT s.*, c.*
FROM Sales s
INNER JOIN Customers c
ON s.userId = c.userId
WHERE s.price = 10;

Getting way more results than expected in SQL left join query

My code is such:
SELECT COUNT(*)
FROM earned_dollars a
LEFT JOIN product_reference b ON a.product_code = b.product_code
WHERE a.activity_year = '2015'
I'm trying to match two tables based on their product codes. I would expect the same number of results back from this as total records in table a (with a year of 2015). But for some reason I'm getting close to 3 million.
Table a has about 40,000,000 records and table b has 2000. When I run this statement without the join I get 2,500,000 results, so I would expect the same even with the left join, but somehow I'm getting 300,000,000. Any ideas? I even referred to the diagram in this post.
It means either your left join is using only part of the foreign key, which causes row multiplication, or there are simply duplicate rows in the joined table.
Use COUNT(DISTINCT a.product_code).
What question are you trying to answer with the T-SQL?
Instead of SELECT COUNT(*), try SELECT a.product_code, b.product_code. That will show you which records match and which don't.
You should also add WHERE b.product_code IS NOT NULL. That should exclude the records that don't match.
Is b the parent table and a the child table? Try a RIGHT JOIN instead.
Or use the table's unique identifier, i.e.
SELECT COUNT(a.earned_dollars_id)
Not sure what your data model looks like or how it is structured, but I'm guessing you only care about earned_dollars?
SELECT COUNT(*)
FROM earned_dollars a
WHERE a.activity_year = '2015'
AND EXISTS (SELECT 1 FROM product_reference b WHERE a.product_code = b.product_code)
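The row multiplication described in the answers can be reproduced on a handful of rows. Below is a small sketch using Python's standard-library sqlite3 module, not the asker's server. The earned_dollars_id and description columns and all sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE earned_dollars (earned_dollars_id INTEGER, product_code TEXT, activity_year TEXT);
    CREATE TABLE product_reference (product_code TEXT, description TEXT);
    INSERT INTO earned_dollars VALUES
        (1, 'P1', '2015'), (2, 'P1', '2015'), (3, 'P2', '2015'),
        (4, 'P3', '2014'), (5, 'P9', '2015');
    -- 'P1' appears twice, so it is not unique in the joined table.
    INSERT INTO product_reference VALUES
        ('P1', 'Widget'), ('P1', 'Widget (duplicate)'), ('P2', 'Gadget');
""")

# Baseline: rows in table a for 2015, without any join.
without_join = conn.execute(
    "SELECT COUNT(*) FROM earned_dollars a WHERE a.activity_year = '2015'"
).fetchone()[0]

# The LEFT JOIN repeats every 'P1' row once per duplicate in table b.
with_left_join = conn.execute("""
    SELECT COUNT(*)
    FROM earned_dollars a
    LEFT JOIN product_reference b ON a.product_code = b.product_code
    WHERE a.activity_year = '2015'
""").fetchone()[0]

# Diagnostic: product codes that occur more than once in table b.
duplicate_codes = conn.execute("""
    SELECT product_code, COUNT(*)
    FROM product_reference
    GROUP BY product_code
    HAVING COUNT(*) > 1
""").fetchall()

# EXISTS tests for a match without multiplying rows.
with_exists = conn.execute("""
    SELECT COUNT(*)
    FROM earned_dollars a
    WHERE a.activity_year = '2015'
      AND EXISTS (SELECT 1 FROM product_reference b
                  WHERE a.product_code = b.product_code)
""").fetchone()[0]

print(without_join, with_left_join, duplicate_codes, with_exists)
```

Running the duplicate check against the real product_reference table is probably the quickest way to confirm or rule out this explanation. Note that the EXISTS form counts only rows that have a match, so it can be lower than the count without the join.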

SQL - Conditional Left Join Count Performance is Slow

I am using SQL Server 2012.
I have three tables. Builders, Addresses and BuilderAddresses.
I have the following query which is used to give me my total count of records during paging:
SELECT COUNT(*) FROM Builders
LEFT JOIN Addresses ON Addresses.AddressId IN
(SELECT AddressId FROM BuilderAddresses WHERE BuilderId = Builders.BuilderId AND IsPrimary = 1)
WHERE Builders.[Email] LIKE '%TEST%'
ORDER BY Builders.[Name]
This query is particularly slow once the table approaches 100k+ records. Does anyone have suggestions on how to make it execute faster?
On a table with 120K records, it takes 452ms to get the count. Returning the records used in the paging, say 100 rows, takes only 11ms. I would really like to improve this if I can.
If I need to add greater detail, please let me know and I will edit the question.
The ORDER BY is not necessary for the COUNT, and you can remove that IN check by joining with BuilderAddresses directly.
Try something like this:
SELECT COUNT(*)
FROM Builders b
LEFT JOIN BuilderAddresses ba ON ba.BuilderId = b.BuilderId AND ba.IsPrimary = 1
LEFT JOIN Addresses a ON a.AddressId = ba.AddressId
WHERE b.[Email] LIKE '%TEST%'
The problem is most likely the use of IN as part of the join predicate. What it looks like you need to do is first join the junction table BuilderAddresses and then join Addresses, so something like this (note that the inner joins below exclude builders without a primary address; keep LEFT JOIN if those should still be counted, and the ORDER BY is dropped because it is not valid alongside a bare COUNT(*)):
SELECT COUNT(*) FROM Builders
JOIN BuilderAddresses ON BuilderAddresses.BuilderId = Builders.BuilderId AND BuilderAddresses.IsPrimary = 1
JOIN Addresses ON Addresses.AddressId = BuilderAddresses.AddressId
WHERE Builders.[Email] LIKE '%TEST%'
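The two rewrites above differ in join type (LEFT JOIN versus plain JOIN), which changes the count whenever a builder has no primary address. Here is a minimal sketch of that difference using Python's standard-library sqlite3 module rather than SQL Server 2012. The sample rows and the Street column are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Builders (BuilderId INTEGER, Name TEXT, Email TEXT);
    CREATE TABLE Addresses (AddressId INTEGER, Street TEXT);
    CREATE TABLE BuilderAddresses (BuilderId INTEGER, AddressId INTEGER, IsPrimary INTEGER);
    INSERT INTO Builders VALUES
        (1, 'A', 'a@test.com'), (2, 'B', 'b@test.com'), (3, 'C', 'c@other.com');
    INSERT INTO Addresses VALUES (10, 'First St'), (11, 'Second St'), (12, 'Third St');
    -- Builder 1 has a primary address; builder 2 only has a non-primary one.
    INSERT INTO BuilderAddresses VALUES (1, 10, 1), (1, 11, 0), (2, 12, 0);
""")

# LEFT JOIN through the junction table keeps builders without a primary address.
left_join_count = conn.execute("""
    SELECT COUNT(*)
    FROM Builders b
    LEFT JOIN BuilderAddresses ba ON ba.BuilderId = b.BuilderId AND ba.IsPrimary = 1
    LEFT JOIN Addresses a ON a.AddressId = ba.AddressId
    WHERE b.Email LIKE '%test%'
""").fetchone()[0]

# Inner joins drop builders that have no primary address.
inner_join_count = conn.execute("""
    SELECT COUNT(*)
    FROM Builders b
    JOIN BuilderAddresses ba ON ba.BuilderId = b.BuilderId AND ba.IsPrimary = 1
    JOIN Addresses a ON a.AddressId = ba.AddressId
    WHERE b.Email LIKE '%test%'
""").fetchone()[0]

print(left_join_count, inner_join_count)
```

Which count is right for paging depends on whether builders without a primary address should appear in the paged list. The sketch says nothing about performance on the real 120K-row table.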

SQL JOIN limit results to rows where specific value does not exist

I am joining two tables using SQL. I'm joining a table which contains charter flight information and a table which contains the crew assigned. In my results, I only want to display the rows that only have a value of "Pilot" in the crew table and not "Copilot" or both.
SELECT * FROM TABLE_A JOIN TABLE_B ON (TABLE_A.Value = TABLE_B.Value) WHERE TABLE_A.OtherValue = 'Pilot'
This is off the top of my head, so some syntax may be off. The main point is the WHERE clause, where you specify the value you are looking for in the column (in your case, Pilot).
EDIT: To exclude a value you can do something like WHERE TABLE.VALUE != 'Copilot'. The != may need to be written as <> depending on the SQL dialect.
EDIT2: My SQL Server is throwing a hissy and not connecting, so this is also entirely off the top of my head and I think it's a bit of a hack job, but I think it'll do the job. :)
SELECT [CHARTER].[CHAR_TRIP], COUNT(*) AS Tally
FROM [CHARTER]
JOIN [CREW] ON ([CHARTER].[CHAR_TRIP] = [CREW].[CHAR_TRIP])
WHERE [CREW].[CREW_JOB] = 'PILOT' OR [CREW].[CREW_JOB] = 'COPILOT'
GROUP BY [CHARTER].[CHAR_TRIP]
HAVING COUNT(*) = 1
This assumes that all flights have a pilot, but not all flights have a co-pilot. To get the exact display you want, you might have to use it as a sub-query (to pull the remaining charter columns and remove the Tally column).
SELECT *
FROM charter ch
JOIN crew cr ON ch.char_trip = cr.char_trip
WHERE NOT EXISTS(SELECT *
FROM crew cr2
WHERE cr2.char_trip = ch.char_trip
AND cr2.crew_job != 'PILOT')
I think that should do the trick. The join to the crew table in line 3 is optional, and only if you need results from that table. The NOT EXISTS anti-join is what evaluates all crew for a given trip and checks for any that are not pilots.
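The NOT EXISTS anti-join above can be exercised on a few made-up trips. This is a minimal sketch using Python's standard-library sqlite3 module rather than SQL Server. The table layout is reduced to the columns named in the answer, and the sample rows are assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE charter (char_trip INTEGER);
    CREATE TABLE crew (char_trip INTEGER, crew_job TEXT);
    INSERT INTO charter VALUES (1), (2), (3), (4);
    -- Trip 1: pilot only. Trip 2: pilot and copilot. Trip 3: copilot only. Trip 4: no crew.
    INSERT INTO crew VALUES (1, 'PILOT'), (2, 'PILOT'), (2, 'COPILOT'), (3, 'COPILOT');
""")

# The answer's query: keep trips where no crew member is anything other than a pilot.
pilot_only = conn.execute("""
    SELECT ch.char_trip
    FROM charter ch
    JOIN crew cr ON ch.char_trip = cr.char_trip
    WHERE NOT EXISTS (SELECT *
                      FROM crew cr2
                      WHERE cr2.char_trip = ch.char_trip
                        AND cr2.crew_job != 'PILOT')
    ORDER BY ch.char_trip
""").fetchall()

# Same anti-join without the join to crew.
without_crew_join = conn.execute("""
    SELECT ch.char_trip
    FROM charter ch
    WHERE NOT EXISTS (SELECT *
                      FROM crew cr2
                      WHERE cr2.char_trip = ch.char_trip
                        AND cr2.crew_job != 'PILOT')
    ORDER BY ch.char_trip
""").fetchall()

print(pilot_only)
print(without_crew_join)
```

One thing the sketch brings out: when the join to crew is left out, a trip with no crew rows at all (trip 4 here) also passes the NOT EXISTS test, so the join is only optional if every trip is known to have at least one crew member.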
You should really help us out here with the schema for us to provide you with a decent query. I think the most important thing here is how do you determine who is a pilot and/or copilot and how do you relate each person to the flight.
I think something like this might help:
SELECT * FROM Charter C
INNER JOIN Crew ON (C.CHAR_TRIP = Crew.CHAR_TRIP)
WHERE Crew.Crew_Job = 'PILOT' AND (SELECT COUNT(*) FROM Charter C2
INNER JOIN Crew Cr2 ON (C2.CHAR_TRIP = Cr2.CHAR_TRIP)
WHERE Cr2.Crew_Job = 'CoPilot'
AND C2.CHAR_TRIP = C.CHAR_TRIP) = 0
This might not be the cleanest solution, but it should do the work.