SQL do i use Subquery or what? - sql

OK, so I have 3 tables:
I need to return which cars have been fixed more than 150 times.
Thanks in advance :)

The query would look something like this:
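SELECT T1.Car, COUNT(T3.Id) AS Fixes -- COUNT(t3.*) is not valid in most dialects
FROM Table1 T1
JOIN Table2 T2 ON T1.Id = T2.table2ID
JOIN Table3 T3 ON T3.Id = T2.table3Id
GROUP BY T1.Car
HAVING COUNT(T3.Id) > 150 -- keep only cars fixed more than 150 times
ORDER BY T1.Car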
Yes, you can also do a subquery: you would select from table 1 and, instead of the count, use a subquery over table 2 and table 3 joined back to table 1.
But you can use joins; I think they will be more efficient here.

First of all, you are using a relational database. Secondly, you happen to have 2 dimension tables and 1 FACT table.
The dimension tables make searching the FACT table easier, though this only matters if you need a characteristic from those dimension tables that you cannot get from the FACT table (such as the [type] of fixes).
Since you want the raw results of cars and their number of repairs, use a GROUP BY with a HAVING clause in your query. Remember that the HAVING clause is still a predicate, so use proper SARGs.
SELECT CAR_ID, COUNT(*) --or COUNT(CAR_ID), it really does not matter
FROM FACT_TABLE
GROUP BY CAR_ID
HAVING COUNT(FIX_ID) > 150 -- "more than 150 times", so > rather than >=
The GROUP BY collapses the table by CAR_ID and the COUNT function counts the rows combined in each group, while the HAVING, being a predicate, filters the results of the aggregate functions.
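To make the GROUP BY / HAVING behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (fact_table, fix_id, car_id) and the sample data are assumptions for illustration only; the real schema is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table: one row per repair.
conn.execute("CREATE TABLE fact_table (fix_id INTEGER PRIMARY KEY, car_id INTEGER)")

# Made-up data: car 1 has 151 repairs, car 2 has exactly 150, car 3 has 3.
rows = [(1,)] * 151 + [(2,)] * 150 + [(3,)] * 3
conn.executemany("INSERT INTO fact_table (car_id) VALUES (?)", rows)

# HAVING filters on the aggregate after the rows are grouped.
result = conn.execute(
    """
    SELECT car_id, COUNT(*) AS repairs
    FROM fact_table
    GROUP BY car_id
    HAVING COUNT(*) > 150
    """
).fetchall()
print(result)
```

Only car 1 should come back: car 2 has exactly 150 repairs, which is not "more than 150".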

Nope, just use two inner joins, then group by car and count the number of rows.

Related

How to use postgres group by statement when joining tables

I have a very simple query that I want to execute in Postgres.
table1 has a one-to-many relation to table2 and table3.
pseudo query is as follows
select * from table1
left join table2 ON table2.table1_id = table1.id
left join table3 ON table3.table1_id = table1.id
group by table1.id
This gives me an error:
"column "table2.id" must appear in the GROUP BY clause or be used in an aggregate function",
and the same for table3.id.
What is the point of GROUP BY if it forces me to add the ids of all the tables to the GROUP BY, thus defeating its purpose (all ids are unique, so no grouping occurs)?
The purpose of the group by is to summarize data. There is one row in the result set for every combination of keys in the group by.
The columns in the result set are either keys in the group by or are aggregations. There is one exception to this rule, involving grouping by unique or primary keys on a table and using other columns.
The use of select * with group by is simply not a correct use of aggregation in SQL.
You seem to be misunderstanding the purpose of this construct. It is possible that you really mean order by -- that will order the result set by the order by keys without changing (i.e. summarizing) the number of rows.
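As an illustration of "keys plus aggregations", here is a small sketch that summarizes the joined rows per table1.id instead of selecting *. It runs on SQLite through Python's sqlite3 module rather than Postgres, and the sample rows are made up; only the table names and join columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER);
CREATE TABLE table3 (id INTEGER PRIMARY KEY, table1_id INTEGER);
INSERT INTO table1 (id) VALUES (1), (2);
INSERT INTO table2 (table1_id) VALUES (1), (1), (2);
INSERT INTO table3 (table1_id) VALUES (1), (1), (1);
""")

# Every selected column is either the GROUP BY key or an aggregate.
# COUNT(DISTINCT ...) is used because joining two one-to-many tables
# multiplies the rows (2 x 3 = 6 joined rows for table1.id = 1).
summary = conn.execute("""
SELECT table1.id,
       COUNT(DISTINCT table2.id) AS table2_rows,
       COUNT(DISTINCT table3.id) AS table3_rows
FROM table1
LEFT JOIN table2 ON table2.table1_id = table1.id
LEFT JOIN table3 ON table3.table1_id = table1.id
GROUP BY table1.id
ORDER BY table1.id
""").fetchall()
print(summary)
```

The result has one row per table1.id, with the child rows summarized as counts rather than listed.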

SQL: Except Join on multiple queries

Getting a bit stuck trying to build this query. (SQL SERVER)
I'm trying to join two tables on similar rows, but then stack the unique rows from both table 1 and table 2 on the result set. I was first shooting for a full outer join, but it leaves my key fields blank when the data comes from only one of the tables.
Example: Full Outer Join
Here's what I would like for the query to be able to do:
Essentially, I would like to have a result table where the key fields (Part and Operation) are all returned in two columns (so like a union), but the Estimated and Actual Rate columns returned side by side where there is a matching row between table 1 and table 2.
I've also been trying to inner join the two tables to make a subquery, then using that inner join for except clause on each of the tables, then stacking the original inner join with the two except unions.
Current Attempt: One Join, Two Excepts, Two Unions
UPDATE: I got the current attempt to return values! It's a bit complicated, though, so I'd appreciate any advice or feedback. Great answers below, thanks; I will need to do some comparisons.
Thanks
SELECT ISNULL(t1.part,t2.part) AS Part,
ISNULL(t1.operation,t2.operation) AS Operation,
ISNULL(t1.estimatedrate, 0) AS [Estimated Rate], -- assumed column name; the quoted string was a literal, not a column
ISNULL(t2.actualrate, 0) AS [Actual Rate] -- assumed column name
FROM table1 t1
FULL OUTER JOIN table2 t2
ON t1.part = t2.part
AND t1.operation = t2.operation
I would do this as a union all and group by:
select part, operation,
sum(estimatedrate) as estimatedrate, sum(actualrate) as actualrate
from ((select part, operation, estimatedrate, 0 as actualrate
from table1
) union all
(select part, operation, 0 as estimatedrate, actualrate
from table2
)
) er
group by part, operation;
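Here is a small runnable sketch of the union all / group by idea, using Python's sqlite3 module instead of SQL Server. It assumes table1 holds estimatedrate and table2 holds actualrate; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (part TEXT, operation INTEGER, estimatedrate INTEGER);
CREATE TABLE table2 (part TEXT, operation INTEGER, actualrate INTEGER);
INSERT INTO table1 VALUES ('A', 10, 5), ('B', 20, 7);  -- B exists only in table1
INSERT INTO table2 VALUES ('A', 10, 6), ('C', 30, 9);  -- C exists only in table2
""")

# Stack both tables with a zero placeholder for the missing rate,
# then collapse matching (part, operation) pairs into one row.
combined = conn.execute("""
SELECT part, operation,
       SUM(estimatedrate) AS estimatedrate,
       SUM(actualrate) AS actualrate
FROM (
    SELECT part, operation, estimatedrate, 0 AS actualrate FROM table1
    UNION ALL
    SELECT part, operation, 0 AS estimatedrate, actualrate FROM table2
) er
GROUP BY part, operation
ORDER BY part, operation
""").fetchall()
print(combined)
```

The key columns are always populated because each branch of the union supplies them, which is what the full outer join could not do without ISNULL/COALESCE on every key.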

Getting way more results than expected in SQL left join query

My code is such:
SELECT COUNT(*)
FROM earned_dollars a
LEFT JOIN product_reference b ON a.product_code = b.product_code
WHERE a.activity_year = '2015'
I'm trying to match two tables based on their product codes. I would expect the same number of results back from this as total records in table a (with a year of 2015). But for some reason I'm getting close to 3 million.
Table a has about 40,000,000 records and table b has 2000. When I run this statement without the join I get 2,500,000 results, so I would expect this even with the left join, but somehow I'm getting 300,000,000. Any ideas? I even referred to the diagram in this post.
It means either your left join is using only part of the foreign key, which causes row multiplication, or there are simply duplicate rows in the joined table.
Use COUNT(DISTINCT a.product_code)
What is the question you are trying to answer with the T-SQL?
Instead of select count(*), try select a.product_code, b.product_code. That will show you which records match and which don't.
You could also add where b.product_code is not null. That should exclude the records that don't match.
Is b the parent table and a the child table? Try a right join instead.
Or use the table's unique identifier, i.e.
SELECT COUNT(a.earned_dollars_id)
I'm not sure what your data model looks like or how it is structured, but I'm guessing you only care about earned_dollars?
SELECT COUNT(*)
FROM earned_dollars a
WHERE a.activity_year = '2015'
and exists (select 1 from product_reference b where a.product_code = b.product_code)
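The row multiplication is easy to reproduce on a toy data set. The sketch below uses Python's sqlite3 module with invented rows; the id and description columns are assumptions, and only the table names and product_code / activity_year come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE earned_dollars (id INTEGER PRIMARY KEY, product_code TEXT, activity_year TEXT);
CREATE TABLE product_reference (product_code TEXT, description TEXT);
INSERT INTO earned_dollars (product_code, activity_year) VALUES
    ('P1', '2015'), ('P1', '2015'), ('P2', '2015'), ('P9', '2015'), ('P3', '2014');
-- 'P1' appears twice in the reference table: this is what inflates the join.
INSERT INTO product_reference VALUES ('P1', 'a'), ('P1', 'b'), ('P2', 'c');
""")

def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

plain = scalar("SELECT COUNT(*) FROM earned_dollars a WHERE a.activity_year = '2015'")
joined = scalar("""
    SELECT COUNT(*) FROM earned_dollars a
    LEFT JOIN product_reference b ON a.product_code = b.product_code
    WHERE a.activity_year = '2015'
""")
with_exists = scalar("""
    SELECT COUNT(*) FROM earned_dollars a
    WHERE a.activity_year = '2015'
      AND EXISTS (SELECT 1 FROM product_reference b WHERE a.product_code = b.product_code)
""")
# Diagnostic: which product codes are duplicated in the reference table?
duplicates = conn.execute("""
    SELECT product_code, COUNT(*) FROM product_reference
    GROUP BY product_code HAVING COUNT(*) > 1
""").fetchall()
print(plain, joined, with_exists, duplicates)
```

Here the left join returns more rows than the plain count because each 'P1' row matches two reference rows, while EXISTS counts each earned_dollars row at most once (and drops the unmatched 'P9' row). Running the duplicates query against the real product_reference table would show whether this is the cause.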

How do I write an SQL query to identify duplicate values in a specific field?

This is the table I'm working with:
I would like to identify only the ReviewIDs that have duplicate deduction IDs for different parameters.
For example, in the image above, ReviewID 114 has two different parameter IDs, but both records have the same deduction ID.
For my purposes, this record (ReviewID 114) has an error. There should not be two or more unique parameter IDs that have the same deduction ID for a single ReviewID.
I would like to write a query to identify these types of records, but my SQL skills aren't there yet. Help?
Thanks!
Update 1: I'm using TSQL (SQL Server 2008) if that helps
Update 2: The output that I'm looking for would be the same as the image above, minus any records that do not match the criteria I've described.
Cheers!
SELECT * FROM table t1 INNER JOIN (
SELECT review_id, deduction_id FROM table
GROUP BY review_id, deduction_id
HAVING COUNT(parameter_id) > 1
) t2 ON t1.review_id = t2.review_id AND t1.deduction_id = t2.deduction_id;
http://www.sqlfiddle.com/#!3/d858f/3
If it is possible to have exact duplicates and that is ok, you can modify the HAVING clause to COUNT(DISTINCT parameter_id).
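To show the difference between the two HAVING variants, here is a rough sketch on SQLite via Python's sqlite3 module. The table name reviews and the rows are made up; review 115 is an exact duplicate pair, which only the non-DISTINCT count flags.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviews (review_id INTEGER, parameter_id INTEGER, deduction_id INTEGER);
INSERT INTO reviews VALUES
    (114, 1, 50), (114, 2, 50),  -- two different parameters, same deduction: an error
    (115, 3, 60), (115, 3, 60),  -- exact duplicate rows
    (116, 4, 70);
""")

def flagged(count_expr):
    """Return (review_id, deduction_id) pairs whose count exceeds 1."""
    return conn.execute(f"""
        SELECT review_id, deduction_id FROM reviews
        GROUP BY review_id, deduction_id
        HAVING {count_expr} > 1
        ORDER BY review_id
    """).fetchall()

by_rows = flagged("COUNT(parameter_id)")
by_distinct = flagged("COUNT(DISTINCT parameter_id)")
print(by_rows, by_distinct)
```

With COUNT(DISTINCT parameter_id), only review 114 is reported, which matches the "two or more unique parameter IDs" rule in the question.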
Select ReviewID, deduction_ID from Table
Group By ReviewID, deduction_ID
Having count(ReviewID) > 1
http://www.sqlfiddle.com/#!3/6e113/3 has an example
If I understand the criteria: For each combination of ReviewID and deduction_id you can have only one parameter_id and you want a query that produces a result without the ReviewIDs that break those rules (rather than identifying those rows that do). This will do that:
;WITH review_errors AS (
SELECT ReviewID
FROM test
GROUP BY ReviewID,deduction_ID
HAVING COUNT(DISTINCT parameter_id) > 1
)
SELECT t.*
FROM test t
LEFT JOIN review_errors r
ON t.ReviewID = r.ReviewID
WHERE r.ReviewID IS NULL
To explain: review_errors is a common table expression (think of it as a named sub-query that doesn't clutter up the main query). It selects the ReviewIDs that break the criteria. When you left join on it, it selects all rows from the left table regardless of whether they match the right table and only the rows from the right table that match the left table. Rows that do not match will have nulls in the columns for the right-hand table. By specifying WHERE r.ReviewID IS NULL you eliminate the rows from the left hand table that match the right hand table.

Problem with sql query

I'm using MySQL and I'm trying to construct a query to do the following:
I have:
Table1 [ID,...]
Table2 [ID, tID, start_date, end_date,...]
What I want from my query is:
Select all entries from Table2 Where Table1.ID=Table2.tID
**where at least one** end_date<today.
The way I have it working right now is that if Table 2 contains (for example) 5 entries but only 1 of them is end_date< today then that's the only entry that will be returned, whereas I would like to have the other (expired) ones returned as well. I have the actual query and all the joins working well, I just can't figure out the ** part of it.
Any help would be great!
Thank you!
SELECT * FROM Table2
WHERE tID IN
(SELECT Table2.tID FROM Table1
INNER JOIN Table2 ON Table1.ID = Table2.tID
WHERE Table2.end_date < NOW()
)
The subquery selects all tIDs that have at least one entry matching the where clause. The main query then uses this list to return every Table2 entry for those tIDs, expired or not.
Note: the inner join filters out all rows from Table1 with no matching entry in Table2. This is no problem; those rows wouldn't have matched the where clause anyway.
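The IN-subquery approach can be tried out on a small data set. This sketch uses SQLite through Python's sqlite3 module rather than MySQL, and passes a fixed, hypothetical "today" as a parameter instead of calling NOW() so the result is reproducible; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID INTEGER PRIMARY KEY);
CREATE TABLE Table2 (ID INTEGER PRIMARY KEY, tID INTEGER, start_date TEXT, end_date TEXT);
INSERT INTO Table1 (ID) VALUES (1), (2);
INSERT INTO Table2 (tID, start_date, end_date) VALUES
    (1, '2023-01-01', '2024-01-01'),  -- expired
    (1, '2024-01-01', '2024-12-31'),  -- not expired, but same tID as an expired row
    (2, '2024-01-01', '2024-12-31');  -- tID 2 has no expired rows
""")

today = "2024-06-01"  # hypothetical current date; ISO strings compare correctly as text

entries = conn.execute("""
SELECT ID, tID, end_date FROM Table2
WHERE tID IN (
    SELECT Table2.tID FROM Table1
    INNER JOIN Table2 ON Table1.ID = Table2.tID
    WHERE Table2.end_date < ?
)
ORDER BY ID
""", (today,)).fetchall()
print(entries)
```

Both rows for tID 1 are returned because one of them has expired, while tID 2 is left out entirely.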
Maybe, just maybe, you could create a subquery to join with your actual tables, and in this subquery use a count() which can then be used in your where clause.