Select distinct record with join count records - sql

I have two tables: Company and Contact, with a relationship of one-to-many.
I have another table Track which identifies some of the companies as parent companies to other companies.
I want to write a SQL query that selects the parent companies from Track and the amount of contacts that each parent has.
SELECT Track.ParentId, Count(Contact.companyId)
FROM Track
INNER JOIN Contact
ON Track.ParentId = Contact.companyId
GROUP BY Track.ParentId
however The result holds less records than when I run the following query:
SELECT DISTINCT Track.ParentId
FROM Track
I tried the first query with an added DISTINCT and it returned the same results (less then what it was meant to).

You're performing an INNER JOIN with the Contact table, which means that any rows from the first table (Track in this case) with no matches to the JOINed table will not show up in your results. Try using a LEFT OUTER JOIN instead.
The COUNT with Contact.companyId will only count rows where there is a match (Contact.companyId is not NULL). Since you're counting contacts that's fine as they will count as 0. If you were trying to count some other set of data and tried to do a COUNT on a specific column (rather than COUNT(*)) then any NULL values in that column would not count towards your total, which might or might not be what you want.

I used an INNER JOIN which returns only records that are identical in both tables.
To return all records from Track table, and records that match in the Contact table, I need to use LEFT JOIN.

Related

Outputting the data from several sql tables without having a common value

I have a select query which combined several tables. PRODUCTION_ORDER_RESULTS, PRODUCTION_ORDERS and SERVICE_GUARANTY_NEW have common value however STOCKS table does not.
SELECT PR_ORDERS.ARRIVED_CITY,
PR_ORDERS.MONTAJ_DATE,
PR_ORDER_RESULT.TRANSFER_DATE,
PR_ORDERS.P_ORDER_ID,
PR_ORDER_RESULT.P_ORDER_ID,
SG.SALE_CONSUMER_ID,
SG.IS_SERI_SONU,
S.BRAND_ID,
S.PROPERTY
FROM workcube_test_1.PRODUCTION_ORDER_RESULTS PR_ORDER_RESULT,
workcube_test_1.PRODUCTION_ORDERS PR_ORDERS,
workcube_test_1.SERVICE_GUARANTY_NEW SG,
workcube_test_1.STOCKS S
WHERE PR_ORDER_RESULT.P_ORDER_ID = PR_ORDERS.P_ORDER_ID
AND PR_ORDER_RESULT.PR_ORDER_ID = SG.PROCESS_ID
when I run the query, it shows the output as below.
The problem here is there are four data rows returned from PRODUCTION_ORDER_RESULTS, PRODUCTION_ORDERS, SERVICE_GUARANTY_NEW and once I have added the STOCKS table, arrived_city, montaj_date, transfer_date columns are side by side with STOCKS table's rows, but the columns value should be null, not filled with data.
The way I tried is UNION of STOCKS table, however unioned table values are ignored, can not use them in html blocks.
there needs to be at least one more join condition among tables where there's for STOCKS table, I think there might exist such a column STOCK_ID within a table such as PRODUCTION_ORDER_RESULTS in order to join with STOCKS table. I think this should be the reason for multiple returning rows. If there's no common column, then the returning data will be produced as many as the number of records within STOCK table due to existing CROSS JOIN logic within the current query. So rearrange your query as
SELECT PR_ORDERS.ARRIVED_CITY,
PR_ORDERS.MONTAJ_DATE,
PR_ORDER_RESULT.TRANSFER_DATE,
PR_ORDERS.P_ORDER_ID,
PR_ORDER_RESULT.P_ORDER_ID,
SG.SALE_CONSUMER_ID,
SG.IS_SERI_SONU,
S.BRAND_ID,
S.PROPERTY
FROM workcube_test_1.PRODUCTION_ORDER_RESULTS PR_ORDER_RESULT
JOIN workcube_test_1.PRODUCTION_ORDERS PR_ORDERS
ON PR_ORDER_RESULT.P_ORDER_ID = PR_ORDERS.P_ORDER_ID
JOIN workcube_test_1.SERVICE_GUARANTY_NEW SG
ON PR_ORDER_RESULT.PR_ORDER_ID = SG.PROCESS_ID
JOIN workcube_test_1.STOCKS S
ON PR_ORDER_RESULT.STOCK_ID = S.ID

Why do repeated values appear in SQL results

I'm with a doubt about joins. For example, using an example database dvdrental, this query:
SELECT customer.customer_id
, first_name
, last_name
FROM customer
INNER JOIN payment ON Customer.customer_id = Payment.customer_id
Some records appear repeated, for example, it appears 3 times "342 Harold Martino" like:
342 Harold Martino
342 Harold Martino
342 Harold Martino
Do you know why it appears repeated records like in this example that appears the same Record 3 times? This repetition means that there are 3 records in the payment table where customer_id = 342? But this query "select * from payment where customer_id = 342" returns 32 records. So I'm not understanding properly how the join works.
There are many resources around this, so to be short your expression says this in plain english:
Get all the records from the customer table
Then for each of those records, get every payment record that has the same value in the customer_Id field.
return a single row for each payment record that duplicates all the fields from the customer record for each row in the payment record.
Finally, only return 3 columns:
the customer_id column from the customer table
the first_name column that is in one of the customer or payment table
the last_name column that is in one of the customer or payment table
Note that we didn't bring back any columns from the payment table... (I assume first_name and last_name are fields in the customer table...)
Keep in mind, a CROSS JOIN (or a FULL OUTER JOIN) is a join that says take all fields from the left side and create a single row combination that is multiplied by the right side, so for every row on the left, return a combination of the left row with every row on the right. So the number of rows returned in a CROSS JOIN is the number of rows in the current table, multiplied by the number of rows in the joined table.
In your query, an INNER JOIN or LEFT INNER JOIN will produce a recordset that includes all the fields from the current table structure and will include fields from the joined table as well.
the implicit LEFT component specifies that only records that match the existing table structure should be returned, so in this case only Payment records that match a customer_id in the currently not filtered customer table will be returned.
The number of resulting rows is the number of rows in the joined table that have a match in the current table.
If instead you want to query:
Select all the customers that have payments
then you can use a join, but you should also use a DISTINCT clause, to only return the unique records:
SELECT DISTINCT customer.customer_id
, first_name
, last_name
FROM customer
INNER JOIN payment ON Customer.customer_id = Payment.customer_id
An alternative way to do this is to use a sub-query instead of a join:
SELECT customer_id
, first_name
, last_name
FROM customer
WHERE EXISTS (SELECT customer_id FROM payment WHERE payment.customer_id = customer.customer_id)
The rules on when to use one style of query over the other are pretty convoluted and very dependant on the number of rows in each table, the types of indexes that are available and even the type or version of the RDBMS you are operating within.
To optimise, run both queries, compare the results and timing and use the one that fits your database better. If later performance becomes an issue, try the other one :)
Select the Customer_id field

Record with latest date, where date comes from a joined table

I have tried every answer that I have found to finding the last record, and I have failed in getting a successful result. I currently have a query that lists active trailers. I am needing it to only show a single row for each trailer entry, where that row is based on a date in a joined table.
I have tables
trailer, company, equipment_group, movement, stop
In order to connect trailer to stop (which is where the date is), i have to join it to equipment group, which joins to movement, which then joins to stop.
I have tried using MAX and GROUP BY, and PARTITION BY, both of which error out.
I have tried many solutions here, as well as these
https://thoughtbot.com/blog/ordering-within-a-sql-group-by-clause
https://www.geeksengine.com/article/get-single-record-from-duplicates.html
It seems that all of these solutions have the date in the same table as the thing that they want to group by, which I do not.
SELECT
trailer.*
company.name,
equipment_group.currentmovement_id,
equipment_group.company_id,
movement.dest_stop_id, stop.location_id,
stop.*
FROM trailer
LEFT OUTER JOIN company ON (company.id = trailer.company_id)
LEFT OUTER JOIN equipment_group ON (equipment_group.id =
trailer.currenteqpgrpid)
LEFT OUTER JOIN movement ON (movement.id =
equipment_group.currentmovement_id)
LEFT OUTER JOIN stop ON (stop.id = movement.dest_stop_id)
WHERE trailer.is_active = 'A'
Using MAX and GROUP BY gives error "invalid in the select list... not contained in...aggregate function"
Welllllll, I never did end up figuring that out, but if I joined movements on to equipment group by two conditions, all is well. Each extra record was created by each company id.... company id is in EVERY table.

getting duplicates when joining tables

I have two tables that I want to join. Table1 has sales order, but it doesn’t have the name of the sales person. It only has employee ID. I have table2, that has the names of employees, and employeeID is common between the two tables. Normally I would use an inner join to get the name of the sales person from table2. The problem is that on table2, there are multiple entries for each employee. If they changed manager, or changed roles within the company, or perhaps went on FMLA, it creates a new row. Therefore, when I join the tables, it creates duplicates because of the multiple entries in table2. A sale shows 3 or 4 times in my results.
Select
a.state_name
,order_number
,a.employeeID
,b.Sales_Rep_Name
,a.order_date
from
table1 as A
Inner join table2 as B
On a.employeeid = b.employeeID
where
b.monthperiod = 'November' <-- If I remove this one it adds duplicates
Is there a way to not get these duplicates? I tried distinct but didn’t work. Probably because the rows have at least one column different. I was able to eliminate the duplicates when I added a where clause asking for last month on table 2, but I am in a situation where I need all months, not just one. I have to manually change the month in order to get the full year.
Any help would be appreciated. Thanks
Use a subquery to get list of distinct employee records and then query the sales table

Inner join sql statement

I have two tables, Invoices and members, connected by PK/FK relationship through the field InvoiceNum. I have created the following sql and it works fine, and pulls 44 records as expected.
SELECT
INVOICES.InvoiceNum,
INVOICES.GroupNum,
INVOICES.DivisionNum,
INVOICES.DateBillFrom,
INVOICES.DateBillTo
FROM INVOICES
INNER JOIN MEMBERS ON INVOICES.InvoiceNum = MEMBERS.InvoiceNum
WHERE MEMBERS.MemberNum = '20032526000'
Now, I want to replace INVOICES.GroupNum and INVOICES.DivisionNum in the above query with GroupName and DivisionName. These values are present in the Groups and Divisions tables which also have the corresponding Group_num and Division_num fields. I have created the following sql. The problem is that it now pulls 528 records instead of 44!
SELECT
INVOICES.InvoiceNum,
INVOICES.DateBillFrom,
INVOICES.DateBillTo,
DIVISIONS.DIVISION_NAME,
GROUPS.GROUP_NAME
FROM INVOICES
INNER JOIN MEMBERS ON INVOICES.InvoiceNum = MEMBERS.InvoiceNum
INNER JOIN GROUPS ON INVOICES.GroupNum = GROUPS.Group_Num
INNER JOIN DIVISIONS ON INVOICES.DivisionNum = DIVISIONS.Division_Num
WHERE MEMBERS.MemberNum = '20032526000'
Any help is greatly appreciated.
You have at least one relation between your tables which is missing in your query. It gives you extra records. Find all common fields. Say, are divisions related to groups?
The statement is fine, as far as the SQL syntax goes.
But the question you have to ask yourself (and answer it):
How many rows in Groups do you get for that given GroupNum?
Ditto for Divisions - how many rows exist for that DivisionNum?
It would appear that those numbers aren't unique - multiple rows exist for each number - therefore you get multiple rows returned