Will Inner join allow duplicates? - sql

If I join two tables using an inner join, will the result contain duplicate values?

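Yes. An inner join returns one row for every pair of rows that satisfies the join condition, so a value repeats whenever its row matches more than one row in the other table.
Suppose you have a CUSTOMERS table: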
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
and an ORDERS table as follows:
+-----+---------------------+-------------+--------+
| OID | DATE | CUSTOMER_ID | AMOUNT |
+-----+---------------------+-------------+--------+
| 102 | 2009-10-08 00:00:00 | 3 | 3000 |
| 100 | 2009-10-08 00:00:00 | 3 | 1500 |
| 101 | 2009-11-20 00:00:00 | 2 | 1560 |
| 103 | 2008-05-20 00:00:00 | 4 | 2060 |
+-----+---------------------+-------------+--------+
Then consider this inner join:
SELECT ID, NAME, AMOUNT, DATE
FROM CUSTOMERS
INNER JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
This would produce the following result, in which customer 3 (kaushik) appears twice because two orders reference that ID:
+----+----------+--------+---------------------+
| ID | NAME | AMOUNT | DATE |
+----+----------+--------+---------------------+
| 3 | kaushik | 3000 | 2009-10-08 00:00:00 |
| 3 | kaushik | 1500 | 2009-10-08 00:00:00 |
| 2 | Khilan | 1560 | 2009-11-20 00:00:00 |
| 4 | Chaitali | 2060 | 2008-05-20 00:00:00 |
+----+----------+--------+---------------------+

Here is an example in which the join key is duplicated in both tables.
select * from customers;
id | name | age | address
----+-------------+-----+-----------
1 | Ramesh | 32 | Ahmedabad
2 | Khilan | 25 | Delhi
3 | kaushik | 23 | Kota <-- id 3 "kaushik"
3 | kaushik_two | 23 | Ahmedabad <-- appears twice
4 | Chaitali | 25 | Mumbai
5 | Hardik | 27 | Bhopal
6 | Komal | 22 | MP
7 | Muffy | 24 | Indore
(8 rows)
select * from orders;
oid | date | customer_id | amount
-----+---------------------+-------------+--------
102 | 2009-10-08 00:00:00 | 3 | 3000 <-- reference to customer 3
100 | 2009-10-08 00:00:00 | 3 | 1500 <-- also appears twice
101 | 2009-11-20 00:00:00 | 2 | 1560
103 | 2008-05-20 00:00:00 | 4 | 2060
104 | 2022-01-01 00:00:00 | 100 | 3900
(5 rows)
Inner Join
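Each of the two customer rows with id 3 matches both orders for customer 3, so the join returns 2 × 2 = 4 rows for "kaushik" and "kaushik_two". Order 104 has no matching customer and is dropped.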
select id, name, amount, date
from customers
inner join orders on customers.id = orders.customer_id;
id | name | amount | date
----+-------------+--------+---------------------
2 | Khilan | 1560 | 2009-11-20 00:00:00
3 | kaushik | 1500 | 2009-10-08 00:00:00 <-- first pair
3 | kaushik | 3000 | 2009-10-08 00:00:00
3 | kaushik_two | 1500 | 2009-10-08 00:00:00 <-- second pair
3 | kaushik_two | 3000 | 2009-10-08 00:00:00
4 | Chaitali | 2060 | 2008-05-20 00:00:00
(6 rows)
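To reproduce the row multiplication without a database server, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The trimmed-down column lists and the table aliases c and o are assumptions made for brevity; the key values are taken from the example above.

```python
import sqlite3

# In-memory database holding only the rows relevant to the duplicate key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (oid INTEGER, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (2, 'Khilan'), (3, 'kaushik'), (3, 'kaushik_two');
    INSERT INTO orders VALUES (102, 3, 3000), (100, 3, 1500), (101, 2, 1560);
""")

# One output row per matching (customer, order) pair.
rows = conn.execute("""
    SELECT c.id, c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON c.id = o.customer_id
    ORDER BY c.id, c.name, o.amount
""").fetchall()

for row in rows:
    print(row)
```

Two customer rows and two order rows share the key 3, so that key alone contributes four of the five result rows.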

Related

What's the shortest way to add a column of row numbers to a query result instead of counting the rows by hand in SQL?

I'm going through some practice questions, and one asked for the number of rows returned by my query. I found myself counting the rows one by one, which seemed inefficient.
How do I create a new column that numbers the rows from 1 to the number of rows?
If my query is as follows,
SELECT *
FROM invoices
WHERE BillingCountry = 'Germany' AND Total > 5
then the result is:
+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+
| InvoiceId | CustomerId | InvoiceDate | BillingAddress | BillingCity | BillingState | BillingCountry | BillingPostalCode | Total |
+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+
| 12 | 2 | 2009-02-11 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart | None | Germany | 70174 | 13.86 |
| 40 | 36 | 2009-06-15 00:00:00 | Tauentzienstraße 8 | Berlin | None | Germany | 10789 | 13.86 |
| 52 | 38 | 2009-08-08 00:00:00 | Barbarossastraße 19 | Berlin | None | Germany | 10779 | 5.94 |
| 67 | 2 | 2009-10-12 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart | None | Germany | 70174 | 8.91 |
| 95 | 36 | 2010-02-13 00:00:00 | Tauentzienstraße 8 | Berlin | None | Germany | 10789 | 8.91 |
| 138 | 37 | 2010-08-23 00:00:00 | Berger Straße 10 | Frankfurt | None | Germany | 60316 | 13.86 |
| 193 | 37 | 2011-04-23 00:00:00 | Berger Straße 10 | Frankfurt | None | Germany | 60316 | 14.91 |
| 236 | 38 | 2011-10-31 00:00:00 | Barbarossastraße 19 | Berlin | None | Germany | 10779 | 13.86 |
| 241 | 2 | 2011-11-23 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart | None | Germany | 70174 | 5.94 |
| 269 | 36 | 2012-03-26 00:00:00 | Tauentzienstraße 8 | Berlin | None | Germany | 10789 | 5.94 |
| 291 | 38 | 2012-06-30 00:00:00 | Barbarossastraße 19 | Berlin | None | Germany | 10779 | 8.91 |
| 367 | 37 | 2013-06-03 00:00:00 | Berger Straße 10 | Frankfurt | None | Germany | 60316 | 5.94 |
+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+
The query returns 12 rows, but I only found that out by counting them manually.
What can I add to my query to get a left-most column that numbers the rows 1 through 12, the way Excel does? And is there a way to do the same for the columns, but with letters in alphabetical order?
I would use the ROW_NUMBER() window function. The ORDER BY inside OVER decides which row gets which number:
SELECT ROW_NUMBER() OVER (ORDER BY BillingAddress, BillingCity) AS RN, *
FROM invoices
WHERE BillingCountry = 'Germany' AND Total > 5
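As a rough way to try this out, the sketch below runs a ROW_NUMBER() query through Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later), uses a cut-down invoices table, and numbers the rows by InvoiceId rather than by address; the last two inserted rows are hypothetical and exist only to show the WHERE filter. If only the number of rows is needed, COUNT(*) returns it directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (InvoiceId INTEGER, BillingCountry TEXT, Total REAL);
    INSERT INTO invoices VALUES
        (12, 'Germany', 13.86),
        (40, 'Germany', 13.86),
        (52, 'Germany', 5.94),
        (53, 'Germany', 1.98),  -- hypothetical row, filtered out by Total > 5
        (60, 'France', 8.91);   -- hypothetical row, filtered out by country
""")

# Number the filtered rows 1..n in InvoiceId order.
numbered = conn.execute("""
    SELECT ROW_NUMBER() OVER (ORDER BY InvoiceId) AS RN, InvoiceId, Total
    FROM invoices
    WHERE BillingCountry = 'Germany' AND Total > 5
    ORDER BY RN
""").fetchall()

# Ask the database for the row count instead of counting by hand.
row_count = conn.execute("""
    SELECT COUNT(*)
    FROM invoices
    WHERE BillingCountry = 'Germany' AND Total > 5
""").fetchone()[0]

for row in numbered:
    print(row)
print("rows:", row_count)
```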

SQL multiple sum by PARTITION

I have the following PostgreSQL table, stock, with this structure:
| column | pk |
+--------+-----+
| date | yes |
| id | yes |
| type | yes |
| qty | |
| fee | |
The table looks like this:
| date | id | type | qty | fee |
+------------+-----+------+------+------+
| 2015-01-01 | 001 | CB04 | 500 | 2 |
| 2015-01-01 | 002 | CB04 | 1500 | 3 |
| 2015-01-01 | 003 | CB04 | 500 | 1 |
| 2015-01-01 | 004 | CB04 | 100 | 5 |
| 2015-01-01 | 001 | CB02 | 800 | 6 |
| 2015-01-02 | 002 | CB03 | 3100 | 1 |
| | | | | |
I want to create a view or query, so that the result looks like this.
| date | type | t_qty | total_weighted_fee |
+------------+------+-------+--------------------+
| 2015-01-01 | CB04 | 2600 | 2.5 |
| 2015-01-01 | CB03 | 3100 | 1 |
| | | | |
This is what I tried:
http://sqlfiddle.com/#!17/39fb8a/18
But it does not give the output I want.
The Sub Query table looks like this:
% of total Qty = qty / t_qty
weighted fee = fee * % of total Qty
| date | id | type | qty | fee | t_qty | % of total Qty | weighted fee |
+------------+-----+------+------+-----+-------+----------------+--------------+
| 2015-01-01 | 001 | CB04 | 500 | 2 | 2600 | 0.19 | 0.38 |
| 2015-01-01 | 002 | CB04 | 1500 | 3 | 2600 | 0.58 | 1.73 |
| 2015-01-01 | 003 | CB04 | 500 | 1 | 2600 | 0.19 | 0.192 |
| 2015-01-01 | 004 | CB04 | 100 | 5 | 2600 | 0.04 | 0.192 |
| 2015-01-01 | 002 | CB03 | 3100 | 1 | 3100 | 1 | 1 |
| | | | | | | | |
You can use aggregation; I don't think you are far off:
select date, type, sum(qty) as t_qty,
       sum(fee * qty * 1.0) / nullif(sum(qty), 0) as total_weighted_fee
from stock
group by date, type;
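To check the aggregation against the sample rows from the question, here is a minimal sketch that runs the same query shape through Python's sqlite3 module instead of PostgreSQL. The column types and the output aliases t_qty and total_weighted_fee are assumptions chosen to match the desired result table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (date TEXT, id TEXT, type TEXT, qty INTEGER, fee INTEGER);
    INSERT INTO stock VALUES
        ('2015-01-01', '001', 'CB04', 500, 2),
        ('2015-01-01', '002', 'CB04', 1500, 3),
        ('2015-01-01', '003', 'CB04', 500, 1),
        ('2015-01-01', '004', 'CB04', 100, 5),
        ('2015-01-01', '001', 'CB02', 800, 6),
        ('2015-01-02', '002', 'CB03', 3100, 1);
""")

# Weighted fee = sum(fee * qty) / sum(qty); NULLIF guards against a zero total.
weighted = conn.execute("""
    SELECT date, type, SUM(qty) AS t_qty,
           SUM(fee * qty * 1.0) / NULLIF(SUM(qty), 0) AS total_weighted_fee
    FROM stock
    GROUP BY date, type
    ORDER BY date, type
""").fetchall()

for row in weighted:
    print(row)
```

For CB04 on 2015-01-01 this is (500*2 + 1500*3 + 500*1 + 100*5) / 2600 = 6500 / 2600 = 2.5, which matches the value in the desired output.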

How to update column with average weekly value for each day in sql

I have the following table. I added a column named WeekValue, and I want to fill it with the weekly average of impressionCnt for the same category in each row.
For example:
+-------------------------+----------+---------------+--------------+
| Date | category | impressioncnt | weekAverage |
+-------------------------+----------+---------------+--------------+
| 2014-02-06 00:00:00.000 | a | 123 | 100 |
| 2014-02-06 00:00:00.000 | b | 121 | 200 |
| 2014-02-06 00:00:00.000 | c | 99 | 300 |
| 2014-02-07 00:00:00.000 | a | 33 | 100 |
| 2014-02-07 00:00:00.000 | b | 456 | 200 |
| 2014-02-07 00:00:00.000 | c | 54 | 300 |
| 2014-02-08 00:00:00.000 | a | 765 | 100 |
| 2014-02-08 00:00:00.000 | b | 78 | 200 |
| 2014-02-08 00:00:00.000 | c | 12 | 300 |
| ..... | | | |
| 2014-03-01 00:00:00.000 | a | 123 | 111 |
| 2014-03-01 00:00:00.000 | b | 121 | 222 |
| 2014-03-01 00:00:00.000 | c | 99 | 333 |
| 2014-03-02 00:00:00.000 | a | 33 | 111 |
| 2014-03-02 00:00:00.000 | b | 456 | 222 |
| 2014-03-02 00:00:00.000 | c | 54 | 333 |
| 2014-03-03 00:00:00.000 | a | 765 | 111 |
| 2014-03-03 00:00:00.000 | b | 78 | 222 |
| 2014-03-03 00:00:00.000 | c | 12 | 333 |
+-------------------------+----------+---------------+--------------+
I tried
update [dbo].[RetailTS]
set Week = datepart(day, dateDiff(day, 0, [Date])/7 *7)/7 +1
to get the week numbers, and then tried to group by week number, date, and category, but this doesn't seem to be correct. How do I write the SQL query? Thanks!
Given that you may be adding more data in the future, thus requiring another update, you might want to just select out the weekly averages:
SELECT
Date,
category,
impressioncnt,
AVG(impressioncnt) OVER
(PARTITION BY category, DATEDIFF(d, 0, Date) / 7) AS weekAverage
FROM RetailTS
ORDER BY
Date, category;
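The windowed average can be tried out with Python's sqlite3 module, as sketched below. SQLite has no DATEDIFF, so this sketch swaps in julianday() arithmetic against 1900-01-01 (the date that 0 stands for in SQL Server) to build the same seven-day buckets; it also assumes SQLite 3.25 or later for window functions. The rows are a small subset of the question's data, with 2014-03-03 paired with the value 765 as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RetailTS (Date TEXT, category TEXT, impressioncnt INTEGER);
    INSERT INTO RetailTS VALUES
        ('2014-02-06', 'a', 123),
        ('2014-02-06', 'b', 121),
        ('2014-02-07', 'a', 33),
        ('2014-02-07', 'b', 456),
        ('2014-03-03', 'a', 765);
""")

# Days since 1900-01-01, integer-divided by 7, gives one bucket per week.
weekly = conn.execute("""
    SELECT Date, category, impressioncnt,
           AVG(impressioncnt) OVER (
               PARTITION BY category,
                            CAST(julianday(Date) - julianday('1900-01-01') AS INTEGER) / 7
           ) AS weekAverage
    FROM RetailTS
    ORDER BY Date, category
""").fetchall()

for row in weekly:
    print(row)
```

Rows of the same category that fall in the same seven-day bucket share one average, while the 2014-03-03 row lands in a later bucket and keeps its own value.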

Count rows each month of a year - SQL Server

I have a table "Product" as follows:
| ProductId | ProductCatId | Price | Date | Deadline |
--------------------------------------------------------------------
| 1 | 1 | 10.00 | 2016-01-01 | 2016-01-27 |
| 2 | 2 | 10.00 | 2016-02-01 | 2016-02-27 |
| 3 | 3 | 10.00 | 2016-03-01 | 2016-03-27 |
| 4 | 1 | 10.00 | 2016-04-01 | 2016-04-27 |
| 5 | 3 | 10.00 | 2016-05-01 | 2016-05-27 |
| 6 | 3 | 10.00 | 2016-06-01 | 2016-06-27 |
| 7 | 1 | 20.00 | 2016-01-01 | 2016-01-27 |
| 8 | 2 | 30.00 | 2016-02-01 | 2016-02-27 |
| 9 | 1 | 40.00 | 2016-03-01 | 2016-03-27 |
| 10 | 4 | 15.00 | 2016-04-01 | 2016-04-27 |
| 11 | 1 | 25.00 | 2016-05-01 | 2016-05-27 |
| 12 | 5 | 55.00 | 2016-06-01 | 2016-06-27 |
| 13 | 5 | 55.00 | 2016-06-01 | 2016-01-27 |
| 14 | 5 | 55.00 | 2016-06-01 | 2016-02-27 |
| 15 | 5 | 55.00 | 2016-06-01 | 2016-03-27 |
I want to create a stored procedure that counts the Product rows for each month of the current year (Year = CurrentYear), like:
| Month| SumProducts | SumExpiredProducts |
-------------------------------------------
| 1 | 3 | 3 |
| 2 | 3 | 3 |
| 3 | 3 | 3 |
| 4 | 2 | 2 |
| 5 | 2 | 2 |
| 6 | 2 | 2 |
What should I do?
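You can use a query like the following, which groups by the month of [Date] and counts a product as expired when its [Date] is later than its Deadline:
SELECT MONTH([Date]) AS [Month],
COUNT(*) AS SumProducts,
COUNT(CASE WHEN [Date] > Deadline THEN 1 END) AS SumExpiredProducts
FROM Product
WHERE YEAR([Date]) = YEAR(GETDATE())
GROUP BY MONTH([Date])
The SumProducts values in the question (3, 3, 3, 2, 2, 2) match a count per MONTH(Deadline) rather than per MONTH([Date]), so group by whichever column defines the month you need.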
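The conditional count can be tried out with Python's sqlite3 module, as in the sketch below. SQLite has no MONTH(), YEAR(), or GETDATE(), so the sketch substitutes strftime() and hard-codes '2016' as a stand-in for the current year; both are assumptions made for illustration. The rows are a subset of the question's table, plus one hypothetical 2015 row to show the year filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ProductId INTEGER, Date TEXT, Deadline TEXT);
    INSERT INTO Product VALUES
        (1,  '2016-01-01', '2016-01-27'),
        (7,  '2016-01-01', '2016-01-27'),
        (12, '2016-06-01', '2016-06-27'),
        (13, '2016-06-01', '2016-01-27'),
        (14, '2016-06-01', '2016-02-27'),
        (99, '2015-06-01', '2015-06-27');  -- hypothetical row from another year
""")

# COUNT ignores NULL, so the CASE without ELSE counts only rows meeting the condition.
monthly = conn.execute("""
    SELECT CAST(strftime('%m', Date) AS INTEGER) AS Month,
           COUNT(*) AS SumProducts,
           COUNT(CASE WHEN Date > Deadline THEN 1 END) AS SumExpiredProducts
    FROM Product
    WHERE strftime('%Y', Date) = '2016'
    GROUP BY CAST(strftime('%m', Date) AS INTEGER)
    ORDER BY 1
""").fetchall()

for row in monthly:
    print(row)
```

The date comparison works on text here only because the values are stored in ISO yyyy-mm-dd form, which sorts the same way as the dates themselves.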

Is it possible to see the 'null' in the table in sql instead of blank

+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | |
| 7 | Muffy | 24 | Indore | |
+----+----------+-----+-----------+----------+
How can I print null instead of a blank space in the SALARY column for IDs 6 and 7 in the table above when inserting the values?
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | null |
| 7 | Muffy | 24 | Indore | null |
+----+----------+-----+-----------+----------+
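You can use ISNULL() (SQL Server) or IFNULL() (MySQL, SQLite) in your SELECT, depending on the RDBMS. Use single quotes for the string literal, since standard SQL treats double quotes as identifier delimiters. Your query would look something like this:
SELECT ID, NAME, AGE, ADDRESS, IFNULL(SALARY, 'null') AS SALARY FROM YOURTABLE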
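The Oracle equivalent of neelsg's answer uses NVL(). Because SALARY is numeric, convert it to text first so that both arguments have the same type:
SELECT id, name, age, address, NVL(TO_CHAR(salary), 'null') AS salary FROM yourtable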
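For a quick check of the substitution, here is a minimal sketch using SQLite's IFNULL() through Python's sqlite3 module. The reduced column list is an assumption for brevity, and SQLite's loose typing lets a numeric column and the text 'null' share one result column, which stricter databases may reject without a cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT, salary REAL);
    INSERT INTO customers VALUES
        (5, 'Hardik', 8500.00),
        (6, 'Komal', NULL);
""")

# IFNULL returns its second argument only when the first one is NULL.
shown = conn.execute("""
    SELECT id, name, IFNULL(salary, 'null') AS salary
    FROM customers
    ORDER BY id
""").fetchall()

for row in shown:
    print(row)
```

The stored value stays NULL; only the displayed result of this SELECT changes.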