Suggestion needed for this Hive Query - hive

I have a select statement with 5 ID columns. I need to look up and select the corresponding customer names from a Customer master table that stores IDs and names, and produce a Customer report. The table's columns are as follows:
origCustomerID,Tier1PartnerID,Tier2PartnerID,DistributorId,EndCustomerID,productId,OrderTotal,OrderDate
The first 5 columns are ID columns that match CustID column in the Customers table. Note that NOT all of these columns will contain a value for a given record at all times, i.e. they could be null at times. Given the current constraints in hiveQL, I can only think of the following way, but this takes up a lot of time and is not the best possible way. Could you please suggest any improvements to this?
Select origCustomerID,a.name,Tier1PartnerID,b.name,Tier2PartnerID,
c.name,DistributorId,d.name,EndCustomerID,e.name,productId,OrderTotal,OrderDate
From Orders O
LEFT OUTER JOIN customers a on o.origCustomerID = a.custid
LEFT OUTER JOIN customers b on o.Tier1PartnerID = b.custid
LEFT OUTER JOIN customers c on o.Tier2PartnerID = c.custid
LEFT OUTER JOIN customers d on o.DistributorId = d.custid
LEFT OUTER JOIN customers e on o.EndCustomerID = e.custid

If the id values are always either customer ids or NULL (i.e. whenever they are not NULL you are sure they are customer ids and not something else) and each record in the Orders table matches at most one customer (i.e. every record has at most one id in those five columns, or possibly the same id several times), you could perhaps use COALESCE in your matching expression.
I can't test this at the moment, but this should join the records using the first non-NULL id from the Orders table.
SELECT [stuff]
FROM Orders O
LEFT OUTER JOIN customers a
ON COALESCE(o.origCustomerID,
o.Tier1PartnerID,
o.Tier2PartnerID,
o.DistributorId,
o.EndCustomerID) = a.custid
Hope that helps.
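To make the difference between the two approaches concrete, here is a minimal sketch run against SQLite through Python's sqlite3 module rather than Hive. The rows are made up, and only three of the five ID columns are modelled. With one alias per ID column, each join compares against its own alias; the COALESCE join resolves only the first non-NULL ID per order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (custid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (origCustomerID INTEGER, Tier1PartnerID INTEGER,
                         EndCustomerID INTEGER, OrderTotal REAL);
    -- hypothetical sample data
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Crest');
    INSERT INTO orders VALUES (1, 2, 3, 100.0), (NULL, 2, NULL, 50.0);
""")

# One alias per ID column: every ID gets its own name lookup.
per_column = conn.execute("""
    SELECT a.name, b.name, e.name, o.OrderTotal
    FROM orders o
    LEFT OUTER JOIN customers a ON o.origCustomerID = a.custid
    LEFT OUTER JOIN customers b ON o.Tier1PartnerID = b.custid
    LEFT OUTER JOIN customers e ON o.EndCustomerID  = e.custid
    ORDER BY o.OrderTotal DESC
""").fetchall()

# Single COALESCE join: only the first non-NULL ID is resolved to a name.
first_non_null = conn.execute("""
    SELECT a.name, o.OrderTotal
    FROM orders o
    LEFT OUTER JOIN customers a
      ON COALESCE(o.origCustomerID, o.Tier1PartnerID, o.EndCustomerID) = a.custid
    ORDER BY o.OrderTotal DESC
""").fetchall()

print(per_column)
print(first_non_null)
```

So the COALESCE version is only a replacement for the five joins when each order carries at most one distinct customer id, as stated above.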

Related

SQL Join between two tables excluding some fields

I have two tables, Customer and Beneficiary, with a ManyToMany relation between them.
The generated table customers_beneficiaries contains the Id of the Beneficiary and the Id of the Customer.
I want to get the list of customers for a given beneficiary_id:
SELECT * from customer c
Full OUTER JOIN customers_beneficiaries cb
ON c.id= cb.customer_id
WHERE cb.beneficiary_id=8;
But the result I am getting also contains the two fields of the customers_beneficiaries table (customer_id and beneficiary_id).
How can I exclude them from the result?
Thank you.
Try this (assuming you can rename the id column in the customer table to customer_id):
SELECT c.* from customer c
Full OUTER JOIN customers_beneficiaries cb
USING(customer_id)
WHERE cb.beneficiary_id=8;
The USING clause works like the ON clause: it takes a list of columns to join on, but those columns must exist under the same name in both tables. Each column used in the join appears only once in the output.
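To see the effect, here is a minimal sketch run against SQLite through Python's sqlite3 module (the sample rows are made up, and the id column is assumed to have been renamed to customer_id as suggested above). It uses a plain JOIN rather than FULL OUTER JOIN, because the WHERE filter on cb.beneficiary_id discards the unmatched rows anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers_beneficiaries (customer_id INTEGER, beneficiary_id INTEGER);
    -- hypothetical sample data
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO customers_beneficiaries VALUES (1, 8), (2, 9), (3, 8);
""")

# c.* restricts the output to the customer table's columns;
# USING joins on the column name shared by both tables.
cur = conn.execute("""
    SELECT c.*
    FROM customer c
    JOIN customers_beneficiaries cb USING (customer_id)
    WHERE cb.beneficiary_id = 8
    ORDER BY c.customer_id
""")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns, rows)
```

Selecting c.* instead of * is what keeps the link table's columns out of the result, so the same idea also works with the original ON c.id = cb.customer_id condition.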

2 Tables - one customer, one transactions. How to handle a customer with no transaction?

I have 2 tables-one customers, one transactions. One customer does not have any transactions. How do I handle that? As I'm trying to join my tables, the customer with no transaction does not show up as shown in code below.
SELECT Orders.Customer_Id, Customers.AcctOpenDate, Customers.CustomerFirstName, Customers.CustomerLastName, Orders.TxnDate, Orders.Amount
FROM Orders
INNER JOIN Customers ON Orders.Customer_Id=Customers.Customer_Id;
I need to be able to account for the customer with no transactions, for example when querying for the smallest transaction amount.
Use the updated query below. A RIGHT OUTER JOIN replaces the INNER JOIN so that all customers are shown, regardless of whether they have placed an order yet.
SELECT Orders.Customer_Id, Customers.AcctOpenDate,
Customers.CustomerFirstName, Customers.CustomerLastName,
Orders.TxnDate, Orders.Amount
FROM Orders
Right Outer JOIN Customers ON Orders.Customer_Id=Customers.Customer_Id;
INNER joins show only those records that are present in BOTH tables.
OUTER joins list all the records in the designated table and show NULLs for the fields from the other table where no matching record exists:
LEFT OUTER JOIN (the first table)
RIGHT OUTER JOIN (the second table)
FULL OUTER JOIN (all records for both tables)
Get up to speed on the join types and how to handle NULLs, and that is 90% of writing SQL script.
Below is the same query with a LEFT JOIN, using ISNULL to turn the Amount column into 0 when no matching order exists:
SELECT Orders.Customer_Id, Customers.AcctOpenDate, Customers.CustomerFirstName, Customers.CustomerLastName
, Orders.TxnDate, ISNULL(Orders.Amount,0)
FROM Customers
LEFT OUTER JOIN Orders ON Orders.Customer_Id=Customers.Customer_Id;
Try this:
SELECT Orders.Customer_Id, Customers.AcctOpenDate, Customers.CustomerFirstName, Customers.CustomerLastName, Orders.TxnDate, Orders.Amount
FROM Orders
Right OUTER JOIN Customers ON Orders.Customer_Id=Customers.Customer_Id;
I strongly recommend LEFT JOIN. This keeps all rows in the first table, along with matching columns in the second. If there are no matching rows, these columns are NULL:
SELECT c.Customer_Id, c.AcctOpenDate, c.CustomerFirstName, c.CustomerLastName,
o.TxnDate, o.Amount
FROM Customers c LEFT JOIN
Orders o
ON o.Customer_Id = c.Customer_Id;
Although you could use RIGHT JOIN, I never use RIGHT JOINs, because I find them much harder to follow. The logic of "keep all rows in the first table I read" is relatively simple. The logic of "I don't know which rows I'm keeping until I read the last table" is harder to follow.
Also note that I included table aliases and changed Customer_Id to come from Customers, the table where you are keeping all rows.
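Here is a small sketch of that behaviour, run against SQLite through Python's sqlite3 module (sample rows are made up, and only a few of the columns are modelled). COALESCE stands in for ISNULL, which is SQL Server specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Customer_Id INTEGER PRIMARY KEY, CustomerLastName TEXT);
    CREATE TABLE Orders (Customer_Id INTEGER, TxnDate TEXT, Amount REAL);
    -- hypothetical sample data: customer 2 has no orders
    INSERT INTO Customers VALUES (1, 'Lee'), (2, 'Moss');
    INSERT INTO Orders VALUES (1, '2020-01-05', 40.0), (1, '2020-02-01', 15.0);
""")

# Customers is the first (left) table, so every customer row is kept.
# Order columns are NULL when nothing matches; COALESCE turns Amount into 0.
rows = conn.execute("""
    SELECT c.Customer_Id, c.CustomerLastName, o.TxnDate, COALESCE(o.Amount, 0)
    FROM Customers c
    LEFT JOIN Orders o ON o.Customer_Id = c.Customer_Id
    ORDER BY c.Customer_Id, o.TxnDate
""").fetchall()

for row in rows:
    print(row)
```

Customer 2 still appears, with a NULL TxnDate and an amount of 0.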
Using CASE replaces each NULL with 0 and every matched row with 1, so you can sum the values. This counts the transactions per customer and reports 0 for customers with no transactions.
SELECT c.Name,
SUM(CASE WHEN t.ID IS NULL THEN 0 ELSE 1 END) as TransactionsPerCustomer
FROM Customers c
LEFT JOIN Transactions t
ON c.Name = t.customerID
group by c.Name
SELECT c.Name,
SUM(CASE WHEN t.ID IS NULL THEN 0 ELSE 1 END) as numberoftransaction
FROM customers c
LEFT JOIN transactions t
ON c.Name = t.customerID
group by c.Name
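A quick way to check the counting logic, sketched with SQLite through Python's sqlite3 module. The tables and rows are made up, and the join here is on a hypothetical numeric Customers.ID rather than on Name as in the queries above. COUNT(t.ID) is shown next to the CASE version because COUNT on a column skips NULLs, so both give the same number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Transactions (ID INTEGER PRIMARY KEY, customerID INTEGER);
    -- hypothetical sample data: Bob has no transactions
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Transactions VALUES (10, 1), (11, 1);
""")

rows = conn.execute("""
    SELECT c.Name,
           SUM(CASE WHEN t.ID IS NULL THEN 0 ELSE 1 END) AS via_case,
           COUNT(t.ID) AS via_count
    FROM Customers c
    LEFT JOIN Transactions t ON t.customerID = c.ID
    GROUP BY c.Name
    ORDER BY c.Name
""").fetchall()

print(rows)
```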

How to put conditions on left joins

I have two tables, CustomerCost and Products that look like the following:
I am joining the two tables using the following SQL query:
SELECT custCost.ProductId,
custCost.CustomerCost
FROM CUSTOMERCOST Cost
LEFT JOIN PRODUCTS prod ON Cost.productId =prod.productId
WHERE prod.productId=4
AND (Cost.Customer_Id =2717
OR Cost.Customer_Id IS NULL)
The result of the join is:
[Image: join result]
What I want is this: when I pass customerId 2717, the query should return only that customer's specific cost, i.e. 258.93, and only when no customerId matches should it fall back to the cost 312.50.
What am I doing wrong here?
You can get your expected output as follows:
SELECT Cost.ProductId,
Cost.CustomerCost
FROM CUSTOMERCOST Cost
INNER JOIN PRODUCTS prod ON Cost.productId = prod.productId
WHERE prod.productId=4
AND Cost.Customer_Id = 2717
However, if you want to allow customer ID to be passed as NULL, you will have to change the last line to AND Cost.Customer_Id IS NULL. To do so dynamically, you'll need to use variables and generate the query based on the input.
The problem in the original query that you have posted is that you have used an alias called custCost which is not present in the query.
EDIT: Actually, you don't even need a join. The CUSTOMERCOST table seems to have both Customer and Product IDs.
You can simply:
SELECT
Cost.ProductId, Cost.CustomerCost
FROM
CUSTOMERCOST Cost
WHERE
Cost.Customer_Id = 2717
AND Cost.productId = 4
You seem to want:
SELECT c.*
FROM CUSTOMERCOST c
WHERE c.productId = 4 AND c.Customer_Id = 2717
UNION ALL
SELECT c.*
FROM CUSTOMERCOST c
WHERE c.productId = 4 AND c.Customer_Id IS NULL AND
NOT EXISTS (SELECT 1 FROM CUSTOMERCOST c2 WHERE c2.productId = 4 AND c2.Customer_Id = 2717);
That is, take the matching cost, if it exists for the customer. Otherwise, take the default cost.
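A minimal sketch of that fallback, run against SQLite through Python's sqlite3 module. The two rows mirror the costs mentioned in the question; the helper cost_for and customer id 9999 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMERCOST (productId INTEGER, Customer_Id INTEGER, CustomerCost REAL);
    -- one customer-specific cost and one default (NULL customer) cost
    INSERT INTO CUSTOMERCOST VALUES (4, 2717, 258.93), (4, NULL, 312.50);
""")

QUERY = """
    SELECT c.CustomerCost FROM CUSTOMERCOST c
    WHERE c.productId = :pid AND c.Customer_Id = :cid
    UNION ALL
    SELECT c.CustomerCost FROM CUSTOMERCOST c
    WHERE c.productId = :pid AND c.Customer_Id IS NULL
      AND NOT EXISTS (SELECT 1 FROM CUSTOMERCOST c2
                      WHERE c2.productId = :pid AND c2.Customer_Id = :cid)
"""

def cost_for(product_id, customer_id):
    """Hypothetical helper: customer-specific cost if present, else the default."""
    return conn.execute(QUERY, {"pid": product_id, "cid": customer_id}).fetchall()

known = cost_for(4, 2717)    # customer has a specific cost
unknown = cost_for(4, 9999)  # no specific cost, falls back to the default row
print(known, unknown)
```

Only one branch of the UNION ALL can return rows for a given customer, because the NOT EXISTS in the second branch is false exactly when the first branch finds a match.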
SELECT Cost.ProductId,
Cost.CustomerCost
FROM CUSTOMERCOST Cost
LEFT JOIN PRODUCTS prod
ON Cost.productId =prod.productId
AND (Cost.Customer_Id =2717 OR Cost.Customer_Id IS NULL)
WHERE prod.productId=4
WHERE applies to the joined row. ON controls the join condition.
Outer joins are why FROM and ON were added to SQL-92. The old SQL-89
syntax had no support for them, and different vendors added different,
incompatible syntax to support them.
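The difference is easiest to see when the filter is on the right-hand table of the LEFT JOIN. A small sketch with SQLite through Python's sqlite3 module; the tables are simplified and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cost (productId INTEGER, Customer_Id INTEGER, CustomerCost REAL);
    -- hypothetical sample data
    INSERT INTO products VALUES (4, 'widget'), (5, 'gadget');
    INSERT INTO cost VALUES (4, 2717, 258.93), (5, 1111, 10.0);
""")

# Filter in ON: it only decides which cost rows match; every product survives.
in_on = conn.execute("""
    SELECT p.productId, c.CustomerCost
    FROM products p
    LEFT JOIN cost c ON c.productId = p.productId AND c.Customer_Id = 2717
    ORDER BY p.productId
""").fetchall()

# Filter in WHERE: it runs after the join, so rows whose cost side is NULL
# (or belongs to another customer) are dropped.
in_where = conn.execute("""
    SELECT p.productId, c.CustomerCost
    FROM products p
    LEFT JOIN cost c ON c.productId = p.productId
    WHERE c.Customer_Id = 2717
    ORDER BY p.productId
""").fetchall()

print(in_on)
print(in_where)
```

With the filter in WHERE, the LEFT JOIN behaves like an INNER JOIN for this condition.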

LEFT JOIN help in sql

I have to make a list of customers who do not have any invoice but have paid an invoice … maybe twice.
But my code (below) returns everything from the left join, whereas I only need the lines highlighted in green.
How can I get a result with only the 2 highlighted lines?
Select paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.value = paymentsfrombank.value
You only want to select columns from paymentsfrombank. So why do you even join?
select invoicenumber, customer, value from paymentsfrombank
except
select invoicenumber, customer, value from debtors;
(This requires exact matches as in your example, i.e. same amount for the invoice/customer).
There are two issues in your SQL. First, you need to join on Invoice number, not on value, as joining on value is pointless. Second, you need to only pick those payments where there are no corresponding debts, i.e. when you left-join, the table on the right has "null" in the joining column. The SQL would be something like this:
SELECT paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.InvoiceNumber = paymentsfrombank.InvoiceNumber
WHERE debtors.InvoiceNumber is NULL
In MySQL, the usual way to flip the relation and extract the rows that don't have a match is:
Select paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.value = paymentsfrombank.value where debtors.value is null
You can use NOT EXISTS :
SELECT p.*
FROM paymentsfrombank p
WHERE NOT EXISTS (SELECT 1 FROM debtors d WHERE d.invoicenumber = p.invoicenumber);
However, a LEFT OUTER JOIN also works if you add a WHERE clause that keeps only the payments with no matching invoice:
SELECT p.invoicenumber, p.customer, p.value
FROM paymentsfrombank P LEFT OUTER JOIN
debtors d
ON d.InvoiceNumber = p.InvoiceNumber
WHERE d.InvoiceNumber IS NULL;
Note: I have used table aliases (p and d), which make the query easier to read and write.
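Both forms can be checked side by side. This is a sketch using SQLite through Python's sqlite3 module; the rows are made up, and both tables are assumed to have an invoicenumber column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE paymentsfrombank (invoicenumber INTEGER, customer TEXT, value REAL);
    CREATE TABLE debtors (invoicenumber INTEGER, customer TEXT, value REAL);
    -- hypothetical sample data: only invoice 100 exists in debtors
    INSERT INTO paymentsfrombank VALUES (100, 'Ann', 50.0), (101, 'Bob', 75.0), (102, 'Cy', 20.0);
    INSERT INTO debtors VALUES (100, 'Ann', 50.0);
""")

# Anti-join with NOT EXISTS: keep payments that have no debtor row.
via_not_exists = conn.execute("""
    SELECT p.invoicenumber, p.customer, p.value
    FROM paymentsfrombank p
    WHERE NOT EXISTS (SELECT 1 FROM debtors d WHERE d.invoicenumber = p.invoicenumber)
    ORDER BY p.invoicenumber
""").fetchall()

# Same result with LEFT JOIN plus IS NULL on the right-hand join column.
via_left_join = conn.execute("""
    SELECT p.invoicenumber, p.customer, p.value
    FROM paymentsfrombank p
    LEFT OUTER JOIN debtors d ON d.invoicenumber = p.invoicenumber
    WHERE d.invoicenumber IS NULL
    ORDER BY p.invoicenumber
""").fetchall()

print(via_not_exists)
print(via_left_join)
```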

SQL: Join 2 tables and return multiple rows from second table based on key of first table

I have one table 'Customers', with a key of customerid.
There is another table PaymentTotals which also has a customerid column. This table stores amounts paid by a customer (PaymentAmount) in a given week (weeknumber field). This implies that in the PaymentTotals table there may be several rows for any one customerid, the difference being the weeknumber for any of these rows.
I am trying to build a query in MSSQL that joins the two tables and will return for a given customerid the PaymentAmount for each different weeknumber.
It is not clear to me how to build this query. Any advice? Thanks.
SELECT *
FROM Customers C INNER JOIN PaymentTotals PT
ON C.customerid = PT.customerid
If one customer can have multiple payments in a given week and you want the total per week, you could do something like:
SELECT C.customerid
,PT.WeekNumber
,SUM(PT.PaymentAmount) AS TotalPayment
FROM Customers C INNER JOIN PaymentTotals PT
ON C.customerid = PT.customerid
GROUP BY C.customerid, PT.WeekNumber
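For a single customer, the same query with a filter might look like the sketch below. It runs against SQLite through Python's sqlite3 module rather than MSSQL, the rows are made up, and the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (customerid INTEGER PRIMARY KEY);
    CREATE TABLE PaymentTotals (customerid INTEGER, weeknumber INTEGER, PaymentAmount REAL);
    -- hypothetical sample data: customer 1 paid twice in week 1
    INSERT INTO Customers VALUES (1), (2);
    INSERT INTO PaymentTotals VALUES (1, 1, 10.0), (1, 1, 5.0), (1, 2, 20.0), (2, 1, 7.5);
""")

# One output row per week for the requested customer, with payments summed.
rows = conn.execute("""
    SELECT C.customerid, PT.weeknumber, SUM(PT.PaymentAmount) AS TotalPayment
    FROM Customers C
    INNER JOIN PaymentTotals PT ON C.customerid = PT.customerid
    WHERE C.customerid = ?
    GROUP BY C.customerid, PT.weeknumber
    ORDER BY PT.weeknumber
""", (1,)).fetchall()

print(rows)
```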