Select SQL table content based on content in foreign table

I have a database holding information about different jobs.
The jobs can be for either internal or external customers.
I need to select the rows in the table Job that point to a record in Customer where isInternal is set to true.
I've tried to use inner joins:
select Job.* from Job as Job
INNER JOIN Task as Task
ON Job.JobID = Task.JobID
Inner Join Customer as Customer
ON Task.CustomerID = Customer.CustomerID
But this way I end up with a lot of duplicate rows from Job.
I tried DISTINCT as well, but then I end up with fewer rows than I actually have.
Can anyone point me in the right direction on how to approach this kind of task with SQL?
In the end this will be used in an SSIS package, for loading data into a staging layer of a DWH.

If you want jobs where any task has an internal customer, you can use exists:
select j.*
from Job j
where exists (select 1
from Task t join
Customer c
on t.CustomerID = c.CustomerID
where j.JobID = t.JobID and
c.isInternal = 1
);
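To see why EXISTS avoids the duplicates, here is a minimal sketch using Python's built-in sqlite3 module. SQLite only stands in for SQL Server so the example runs in-process; the table and key names follow the question, while the JobName column, the TaskID column and all sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, isInternal INTEGER);
CREATE TABLE Job (JobID INTEGER PRIMARY KEY, JobName TEXT);
CREATE TABLE Task (TaskID INTEGER PRIMARY KEY, JobID INTEGER, CustomerID INTEGER);
INSERT INTO Customer VALUES (1, 1), (2, 0);
INSERT INTO Job VALUES (10, 'internal job'), (20, 'external job');
-- Job 10 has two tasks for the internal customer; job 20 has one external task.
INSERT INTO Task VALUES (100, 10, 1), (101, 10, 1), (102, 20, 2);
""")

# Plain joins return one row per matching task, so job 10 appears twice.
join_rows = conn.execute("""
    SELECT j.JobID
    FROM Job j
    JOIN Task t ON j.JobID = t.JobID
    JOIN Customer c ON t.CustomerID = c.CustomerID
    WHERE c.isInternal = 1
""").fetchall()

# EXISTS only asks whether at least one matching task exists, so each job
# is returned at most once.
exists_rows = conn.execute("""
    SELECT j.JobID
    FROM Job j
    WHERE EXISTS (SELECT 1
                  FROM Task t
                  JOIN Customer c ON t.CustomerID = c.CustomerID
                  WHERE t.JobID = j.JobID AND c.isInternal = 1)
""").fetchall()

print(join_rows)
print(exists_rows)
```

With this sample data the join returns job 10 twice, while the EXISTS version returns it once.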

SQL Server: Replicate data from multiple tables into one

Is it possible to replicate data from multiple tables into one real time?
For example I have 3 tables:
Table: DriverLocation
Columns: DriverId, Latitude, Longitude, GpsLastUpdate, Speed, ...
Table: Driver
Columns: DriverId, FirstName, LastName, Address, IsActive, ...
Table: Job
Columns: JobId, DriverId, JobName, Pickup, Delivery, ...
I'd like to create a join query between these 3 tables and in real time publish that data into a table called RealTimeDriverInfo.
Example (pseudo code):
Driver(FirstName, LastName)
JOIN DriverLocation(Latitude, Longitude)
JOIN Job(JobName)
PUT RESULT IN REAL TIME INTO => RealTimeDriverInfo table
Is that possible?
EDIT: NOTE:
The reason I want the data in a table is that I would like to use a C# library, SQLTableDependency, to get real-time notifications from a table. The library only works with tables, not views, and it can only monitor a single table; it cannot monitor a join across multiple tables.
https://github.com/christiandelbianco/monitor-table-change-with-sqltabledependency
This is why I somehow need to join multiple tables into one table.
First, a simple view may do what you want:
CREATE VIEW v_RealTimeDriverInfo AS
SELECT D.FirstName, D.LastName, DL.Latitude, DL.Longitude, J.JobName
FROM Driver D JOIN
DriverLocation DL
ON D.DriverID = DL.DriverID JOIN
Job J
ON J.DriverID = D.DriverID;
With indexes on the columns used for JOINing, this would usually be fast enough when querying. This is the "standard" way to do what you want.
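As a small illustration of why a plain view is always current, the sketch below builds the three tables in an in-memory SQLite database through Python's sqlite3 module, defines the view, and then changes a location. SQLite is only a stand-in for SQL Server here, the tables are cut down to the columns the view needs, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Driver (DriverId INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE DriverLocation (DriverId INTEGER, Latitude REAL, Longitude REAL);
CREATE TABLE Job (JobId INTEGER PRIMARY KEY, DriverId INTEGER, JobName TEXT);
INSERT INTO Driver VALUES (1, 'Ann', 'Lee');
INSERT INTO DriverLocation VALUES (1, 10.0, 20.0);
INSERT INTO Job VALUES (1, 1, 'Delivery A');

CREATE VIEW v_RealTimeDriverInfo AS
SELECT D.FirstName, D.LastName, DL.Latitude, DL.Longitude, J.JobName
FROM Driver D
JOIN DriverLocation DL ON D.DriverId = DL.DriverId
JOIN Job J ON J.DriverId = D.DriverId;
""")

# The view stores no data of its own; it re-runs the join on every query.
before = conn.execute("SELECT Latitude FROM v_RealTimeDriverInfo").fetchall()
conn.execute("UPDATE DriverLocation SET Latitude = 11.5 WHERE DriverId = 1")
after = conn.execute("SELECT Latitude FROM v_RealTimeDriverInfo").fetchall()

print(before, after)
```

The second read reflects the update without any copy step, which is the property the question is after. A plain view by itself does not satisfy a library that can only watch a physical table.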
If you actually want a separate "virtual" table that is always up-to-date, then you want something akin to a materialized view, which SQL Server supports as indexed views. The idea is that the index is maintained "real-time" as the underlying data changes.
For your example, it would look like:
-- Indexed views require SCHEMABINDING and two-part table names (the dbo schema is assumed here).
CREATE VIEW dbo.v_RealTimeDriverInfo
WITH SCHEMABINDING AS
SELECT D.FirstName, D.LastName, DL.Latitude, DL.Longitude, J.JobName
FROM dbo.Driver D JOIN
dbo.DriverLocation DL
ON D.DriverID = DL.DriverID JOIN
dbo.Job J
ON J.DriverID = D.DriverID;
GO
CREATE UNIQUE CLUSTERED INDEX v_RealTimeDriverInfo_pk ON dbo.v_RealTimeDriverInfo (FirstName, LastName, Latitude, Longitude, JobName);
Is inserting into a different table necessary? A view would suffice in this instance. I am assuming that JobId is an integer that increments by one with each job. I use CROSS APPLY to pull in the most recent location and job. You'll also have to have some logic to determine whether or not the driver is on an active job.
select a.FirstName FirstName
,a.LastName LastName
,b.Latitude Latitude
,b.Longitude Longitude
,c.JobName JobName
from Driver a
cross apply (
select top 1
*
from DriverLocation b
where b.DriverId = a.DriverId
order by b.GpsLastUpdate desc
) b
cross apply (
select top 1
c.*
from Job c
where c.DriverId = a.DriverId
order by c.JobId desc
) c
Sure you can, just a quick couple of JOINS. But it seems that creating a view would probably serve you a little better.
INSERT INTO RealTimeDriverInfo (FirstName,LastName,Latitude,Longitude,JobName)
SELECT D.FirstName, D.LastName,DL.Latitude, DL.Longitude,J.JobName
FROM Driver D
INNER JOIN DriverLocation DL on D.DriverID = DL.DriverID
INNER JOIN Job J on J.DriverID = D.DriverID

Data Mart - how to handle one to many relation?

I have the following situation that I am not sure how to handle:
There are three tables: Invoice_Item, Service and ServiceLang. The Invoice_Item table has an FK_Service key (one to one). The Service table has an FK_Service_Lang key. The ServiceLang table has an FK_Service key, which makes it a many-to-many relation.
In other words, Invoice_Item can have multiple ServiceLang records, which means that when I make a join query, invoice_item records get duplicated. What are the options to handle such situations?
I would like to have ServiceLang dimension in the cube, but I am not sure how to handle duplicates caused by join.
EDIT
I've made an example:
The queries are as follows:
-- One lang for service A, two langs for service B
select * from ServiceLang
-- Two records: A and B
select * from Service
-- Total amount is 20
select * from InvoiceItem
-- Query to populate Fact table
-- Total amount is 30
select *
from InvoiceItem II
inner join Service S on II.FK_Service = S.PK_Service
inner join ServiceLang SL on S.PK_Service = SL.FK_Service
So, if there are two Service_Lang records related to one service, then there is a duplicate row, meaning that the total services amount would be 30 when it should be 20. So, my question is how to handle these situations?
From the description, you are mistaken. Each Invoice_Item has one and only one Service, and each Service has one and only one Service_Lang. However, each Service_Lang has many Service records and each Service has many Invoice_Item records.
The relationships are
Invoice_Item (n) <- (1) Service (n) <- (1) Service_Lang
Thus the JOIN would be
Select Invoice_Item.*, Service_Lang.WhateverColumnYouWant
From Invoice_Item
Inner Join Service On Service.Key = Invoice_Item.FK_Service_Key
Inner Join Service_Lang On Service_Lang.Key = Service.FK_Service_Lang_Key
Edit: So the Service table does not have a FK_Service_Lang key on it, in which case you can only select one of the possible values for languages associated with the service. You could select the Min, the Max or some derivation based upon your preferred language, some examples...
Select InvoiceItem.*,
Case When Exists (Select 1 From ServiceLang
Where ServiceLang.FK_Service = InvoiceItem.FK_Service
And ServiceLang.Name = 'English')
Then 'English'
Else (Select Max(Name) From ServiceLang
Where ServiceLang.FK_Service = InvoiceItem.FK_Service)
End As ServiceLanguage,
(Select Max(Name) From ServiceLang
Where ServiceLang.FK_Service = InvoiceItem.FK_Service) As MaxLanguage,
(Select Min(Name) From ServiceLang
Where ServiceLang.FK_Service = InvoiceItem.FK_Service) As MinLanguage
From InvoiceItem
I've no idea how big your ServiceLang table is, but good practice would be to ensure there is an index on the FK_Service column.
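The fan-out from the question and the effect of picking a single language per service can be reproduced with a short sketch. It uses Python's sqlite3 module as a stand-in database; the table names and the FK_Service / PK_Service / Name columns follow the thread, while PK_InvoiceItem, PK_ServiceLang, Amount and the sample rows are assumptions chosen so that the totals match the 20 and 30 mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Service (PK_Service INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ServiceLang (PK_ServiceLang INTEGER PRIMARY KEY,
                          FK_Service INTEGER, Name TEXT);
CREATE TABLE InvoiceItem (PK_InvoiceItem INTEGER PRIMARY KEY,
                          FK_Service INTEGER, Amount INTEGER);
CREATE INDEX ix_ServiceLang_FK_Service ON ServiceLang (FK_Service);
INSERT INTO Service VALUES (1, 'A'), (2, 'B');
-- One language for service A, two languages for service B.
INSERT INTO ServiceLang VALUES (1, 1, 'English'), (2, 2, 'English'), (3, 2, 'German');
INSERT INTO InvoiceItem VALUES (1, 1, 10), (2, 2, 10);
""")

# Joining to ServiceLang repeats the invoice item for service B once per language.
joined_total = conn.execute("""
    SELECT SUM(II.Amount)
    FROM InvoiceItem II
    JOIN Service S ON II.FK_Service = S.PK_Service
    JOIN ServiceLang SL ON S.PK_Service = SL.FK_Service
""").fetchone()[0]

# A scalar subquery picks exactly one language per invoice item, so no row repeats.
one_lang_rows = conn.execute("""
    SELECT II.PK_InvoiceItem, II.Amount,
           (SELECT MAX(SL.Name) FROM ServiceLang SL
            WHERE SL.FK_Service = II.FK_Service) AS ServiceLanguage
    FROM InvoiceItem II
    ORDER BY II.PK_InvoiceItem
""").fetchall()
fixed_total = sum(amount for _, amount, _ in one_lang_rows)

print(joined_total, fixed_total, one_lang_rows)
```

Whether MAX, MIN or a preferred-language rule is the right choice depends on what the language dimension is supposed to mean in the cube; the sketch only shows that any single-valued pick keeps the measure from being double counted.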

Confused on the logic of multiple joins on multiple tables

I'm having a little problem extracting and counting data from my database because of the way the database is set up.
Each case has multiple customers and suppliers, with one main supplier.
The main problems I need to overcome are as follows:
To be able to first count the total number of customers within a period of months, driven by when the supplier joined the company.
Count how many customers have had an "initial contact", as sometimes there will be no initial contact in the database.
I have tried to do this by using multiple joins in a single query, but this doesn't seem to return complete data.
I'm confused about using multiple joins. I understand that they can be executed in any order, but I'm unsure what the second join is running on, and whether I can legally join unrelated tables in the same query or need a separate query for that.
Please find below a recreation of one of my many queries, with a very simplified picture:
SELECT Count(cc.customercase)
FROM customer cc
LEFT JOIN customer c
ON cc.custid = c.custid
LEFT JOIN maincase m
ON m.id = cc.caseid
LEFT JOIN custcontactlog cl
ON cl.caseid = cc.custcaseid
LEFT JOIN supcase sc
ON sc.caseid = m.id
WHERE cl.contactlogtype = 'Initial Contact'
AND sc.primarysupplyer = 1
AND Calctargetdate(sc.joindate) > cl.postdate
AND cl.postdate > sc.joindate
AND c.gender = 'M'
AND sc.joindate BETWEEN CONVERT(DATETIME, '01/01/2012', 103) AND
CONVERT(DATETIME, '31/03/2012', 103)
http://i50.tinypic.com/2qk3pqa.png
I would suggest that, since you are trying to achieve two things, you use two separate queries.

Retrieve different row from same table

I have the following set of tables:
customer(cus_id,cus_name);
jointAccount(cus_id,acc_number,relationship);
account(acc_number,cus_id)
Now I want to create a SELECT statement that lists all the joint accounts.
It should include both customers' names and the relationship.
I have no idea how to retrieve both customer names. Is it possible to do this?
Generally speaking, yes. I'm assuming you mean you want to get customer info for both sides of the joint account per your jointAccount table. Not sure what database you're using so this answer is assuming MySQL.
You can join on the same table twice in a single SQL query. I'm assuming you have not yet created your tables, as you have cus_id listed twice in the jointAccount table. Typically these would be something like cus_id1 and cus_id2, which I've used in my sample query below.
Example:
SELECT c1.cus_id AS cust1_id, c1.cus_name AS cust1_name
, c2.cus_id AS cust2_id, c2.cus_name AS cust2_name, j.relationship
FROM jointAccount j
INNER JOIN customer c1
ON c1.cus_id = j.cus_id1
INNER JOIN customer c2
ON c2.cus_id = j.cus_id2
I haven't tested this but that's the general idea.
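A minimal sketch of joining customer twice, assuming the cus_id1 / cus_id2 columns suggested above. SQLite (through Python's sqlite3 module) stands in for MySQL, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT);
-- cus_id1 / cus_id2 are the assumed column names for the two account holders.
CREATE TABLE jointAccount (acc_number INTEGER, cus_id1 INTEGER,
                           cus_id2 INTEGER, relationship TEXT);
INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO jointAccount VALUES (500, 1, 2, 'spouse');
""")

# The same table is joined twice under two aliases, one per account holder.
rows = conn.execute("""
    SELECT c1.cus_name AS cust1_name, c2.cus_name AS cust2_name, j.relationship
    FROM jointAccount j
    INNER JOIN customer c1 ON c1.cus_id = j.cus_id1
    INNER JOIN customer c2 ON c2.cus_id = j.cus_id2
""").fetchall()

print(rows)
```

Each alias behaves like an independent copy of customer, which is what lets a single result row carry both names.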
Try this query:
SELECT * FROM jointAccount a LEFT JOIN customer c ON a.cus_id = c.cus_id;
Just replace the * with the names of the columns you need.

New to SQL. Query SQL database using info across three tables

This is using phpMyAdmin.
I need to find the contact information for Subscribers who have pending Orders on November 15th. Their contact information is stored in a table called Subscribers, and the primary key is UID (User ID). The Subscriptions Table has a primary key called SID (Subscriptions ID). The Subscriptions table also stores the UID for each Subscription. However, the Orders table is where the Date is stored, and this table stores the SID but not the UID, so I can't directly JOIN Orders with Subscribers.
I have to JOIN Orders with Subscriptions on SID where the Orders Date is 11-15-10, and then I have to JOIN the resulting table with the Subscribers table on UID.
I'm currently trying this:
SELECT * FROM Subscribers
RIGHT JOIN (Orders a, Subscriptions b, Subscribers c)
ON (a.SID = b.SID AND b.UID = c.UID)
WHERE a.Date = '2010-11-01'
This is causing a massive lag followed by Gateway Timeout.
This is a classic case of knowing what to do, but not knowing how to do it. Any help would be greatly appreciated. Thanks!
You could try this:
SELECT
scrb.*
FROM
Subscribers scrb
WHERE
scrb.UID in (
SELECT DISTINCT
scrp.UID
FROM
Subscriptions scrp
INNER JOIN Orders ordr ON
ordr.SID = scrp.SID
WHERE
ordr.Date = STR_TO_DATE('2010-11-01', '%Y-%m-%d')
)
Not sure if you're going to see a big performance improvement though... Maybe your tables need a better indexing strategy?
In fact, you should try executing just the inner query (SELECT DISTINCT scrp.UID...) first... If it is too slow, I would guess your problem is on the Orders.Date field - a full scan over that table probably has a high performance cost.
Why do you join Subscribers to Subscribers?
SELECT * FROM Subscribers ... JOIN ... Subscribers c)
Given the limited amount we know about your schema, it seems like you'd do better with an INNER JOIN, which will filter records for you, and @seriyPS is right about the redundant Subscribers table - currently, this query as written is performing a CROSS JOIN, joining all Subscribers to every result of Subscriber joined to Subscription joined to Order...
Is there a reason why this won't work?
SELECT a.*
FROM Subscribers a
INNER JOIN Subscriptions b ON a.UID = b.UID
INNER JOIN Orders c ON b.SID = c.SID
WHERE c.Date = '2010-11-01'
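The two approaches suggested in this thread (the chained INNER JOIN and the IN subquery) can be compared on toy data. The sketch below uses Python's sqlite3 module rather than MySQL; the UID, SID and Date columns come from the question, while the Email and OID columns, the index name and the rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Subscribers (UID INTEGER PRIMARY KEY, Email TEXT);
CREATE TABLE Subscriptions (SID INTEGER PRIMARY KEY, UID INTEGER);
CREATE TABLE Orders (OID INTEGER PRIMARY KEY, SID INTEGER, Date TEXT);
-- An index on the filtered column avoids a full scan of Orders.
CREATE INDEX ix_Orders_Date ON Orders (Date);
INSERT INTO Subscribers VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO Subscriptions VALUES (10, 1), (20, 2);
INSERT INTO Orders VALUES (100, 10, '2010-11-01'), (101, 20, '2010-12-01');
""")

# Orders -> Subscriptions on SID, then Subscriptions -> Subscribers on UID.
join_rows = conn.execute("""
    SELECT a.UID, a.Email
    FROM Subscribers a
    INNER JOIN Subscriptions b ON a.UID = b.UID
    INNER JOIN Orders c ON b.SID = c.SID
    WHERE c.Date = '2010-11-01'
""").fetchall()

# Same filter expressed as a subquery; each subscriber appears at most once.
in_rows = conn.execute("""
    SELECT scrb.UID, scrb.Email
    FROM Subscribers scrb
    WHERE scrb.UID IN (SELECT scrp.UID
                       FROM Subscriptions scrp
                       INNER JOIN Orders ordr ON ordr.SID = scrp.SID
                       WHERE ordr.Date = '2010-11-01')
""").fetchall()

print(join_rows, in_rows)
```

On this data both queries return the same single subscriber. They would differ only if a subscriber had several matching orders, in which case the join repeats the subscriber once per order and the IN form does not.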