Concise SQL for joining many tables on a single column - sql

I have about 122 tables that all share a particular column. Is there an elegant/concise method to join all of these tables on that column without having 121 instances of
join on A.id = B.id
in the query?

If the column in question has the same name in both tables (which it should) then you can use this shorter syntax:
SELECT ... FROM table1 JOIN table2 USING (column)
The column will also appear only once in the result, instead of once per table.
You will still have to write a JOIN clause for each table, though.
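Since a JOIN clause is still needed per table, one option is to generate the statement instead of typing it. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in database, and the table names t1..t5, the columns val1..val5 and the helper build_using_join are hypothetical. Table names are interpolated directly into the SQL text, so they must come from a trusted source such as the catalog.

```python
import sqlite3

def build_using_join(tables, column):
    """Return a SELECT that joins every table in `tables` on `column` via USING."""
    first, *rest = tables
    joins = "".join(f" JOIN {t} USING ({column})" for t in rest)
    return f"SELECT * FROM {first}{joins}"

conn = sqlite3.connect(":memory:")
tables = [f"t{i}" for i in range(1, 6)]  # hypothetical stand-in for the 122 real tables
for i, t in enumerate(tables, start=1):
    conn.execute(f"CREATE TABLE {t} (id INTEGER, val{i} TEXT)")
    conn.execute(f"INSERT INTO {t} VALUES (1, 'row from {t}')")

query = build_using_join(tables, "id")
cursor = conn.execute(query)
columns = [d[0] for d in cursor.description]  # the shared column is listed once
rows = cursor.fetchall()
print(query)
print(columns)
```

The generated text can be pasted into a view definition, so the long join only has to be produced once.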

Here goes your solution:
Create table and insert statement:
create table splitUpdate (no int,productname varchar(10),productcrossell varchar(20));
insert into splitUpdate values (1,'a','a(1)');
insert into splitUpdate values (2,null,'c(4),d(5)');
insert into splitUpdate values (3,null,'Z(1),b(2)');
create table eleminate (product varchar(20));
insert into eleminate values('x');
insert into eleminate values('y');
insert into eleminate values('Z');
insert into eleminate values('z');
Update Query:
with cte as (
select no,productname,p.product,row_number()over(partition by no)rn ,substring(p.product from 1 for position('(' in p.product )-1) SplittedProduct
from splitupdate t, unnest(string_to_array(t.productcrossell ,','))p(product)
where substring(p.product from 1 for position('(' in p.product )-1) not in (select product from eleminate))
update splitupdate set productname=splittedproduct
from cte
where splitupdate.productname is null and splitupdate.no=cte.no and cte.rn=1
SplitUpdate Table before updating:
|no|productname|productcrossell|
|1 |a |a(1) |
|2 |null |c(4),d(5) |
|3 |null |Z(1),b(2) |
Result:
|no|productname|productcrossell|
|1 |a |a(1) |
|2 |c |c(4),d(5) |
|3 |b |Z(1),b(2) |
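The selection rule used by the update (split the list, strip the "(n)" suffix, skip excluded names, keep the first survivor) can be sketched outside the database as well. This is an illustration in Python, not the query above; the helper name first_allowed_product is hypothetical. It takes "first" to mean list order, whereas row_number() without an ORDER BY does not guarantee any particular order.

```python
def first_allowed_product(crossell, excluded):
    """Return the first name in a 'name(n),name(n)' list that is not excluded."""
    for item in crossell.split(","):
        name = item.split("(", 1)[0].strip()  # text before the opening parenthesis
        if name and name not in excluded:
            return name
    return None

rows = [
    {"no": 1, "productname": "a", "productcrossell": "a(1)"},
    {"no": 2, "productname": None, "productcrossell": "c(4),d(5)"},
    {"no": 3, "productname": None, "productcrossell": "Z(1),b(2)"},
]
excluded = {"x", "y", "Z", "z"}  # contents of the eleminate table

# Only rows whose productname is missing are filled in, as in the UPDATE.
for row in rows:
    if row["productname"] is None:
        row["productname"] = first_allowed_product(row["productcrossell"], excluded)

print([(r["no"], r["productname"]) for r in rows])
```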

Related

Number of expected rows in sql when we perform an inner join

Let's say we have a table A with m rows and a table B with n rows, where m > n. What are the max and min numbers of rows returned when we perform an inner join?
I know that the min will be 0, since the inner join returns the common rows and there could be no common row between the two. But what will be the max: is it n or m-n?
Also, what are the max and min numbers of rows returned by a left join in the same scenario? Is it m for both?
That all depends. If you assume that every row matches at most one row in the other table, then the min number of rows is 0 and the max number of rows is min(m, n). If a row from A can match multiple rows from B, then the max explodes to m * n, which is reached when every row in A matches every row in B.
The following returns 3 rows, since the matches are direct.
WITH a(id, name) AS (
SELECT *
FROM (VALUES (1, 'Ringo'),
(2, 'George'),
(3, 'Paul'),
(4, 'John')) as a
), b(id, food) AS (
SELECT *
FROM (VALUES (1, 'Eggs'),
(2, 'Ham'),
(3, 'Spam')) as b
)
SELECT *
FROM a
INNER JOIN b ON a.id = b.id;
+--+------+--+---------+
|id|name |id|food |
+--+------+--+---------+
|1 |Ringo |1 |Eggs |
|2 |George|2 |Ham |
|3 |Paul |3 |Spam |
+--+------+--+---------+
But this returns many more rows.
WITH a(id, name) AS (
SELECT *
FROM (VALUES (1, 'Ringo'),
(2, 'George'),
(3, 'Paul'),
(4, 'John')) as a
), b(id, food) AS (
SELECT *
FROM (VALUES (1, 'Eggs'),
(2, 'Ham'),
(3, 'Spam')) as b
)
SELECT *
FROM a
INNER JOIN b ON b.food <= a.name;
+--+------+--+---------+
|id|name |id|food |
+--+------+--+---------+
|1 |Ringo |1 |Eggs |
|2 |George|1 |Eggs |
|3 |Paul |1 |Eggs |
|4 |John |1 |Eggs |
|1 |Ringo |2 |Ham |
|3 |Paul |2 |Ham |
|4 |John |2 |Ham |
+--+------+--+---------+
It is often assumed that the values in the joining column are unique, but that need not be the case. The Venn diagrams commonly used to represent joins are usually read with this assumption in mind, and are generally misleading. I like to think of this in terms of two cases.
Case 1: the joining values are unique within each table, and it is assumed that there are common values between the tables
This is probably the most typical case. Here the min row count is zero (one if some intersection is assumed); if every value in the smaller table is also present in the larger table, then row count = min(m, n).
Case 2: there is no expectation of uniqueness for the joining values
In the most degenerate case, all m and n rows have the same joining value. The number of output rows (matches) is then the same as for a cross join: row count = m*n.
I find it's easiest to think of Inner and Left/Right/Full Outer joins as a subtractive process from the cross join (Cartesian product). The best explanation I've seen anywhere is given by Martin Smith's answer here.
The maximum number of rows is generated when all the key values are the same. In this case, the inner join is equivalent to a cross join, and the maximum number is m * n.
The maximum number for a left or right outer join is basically the same, with the caveat that an outer join still returns the rows of the preserved table when the other table is empty. So the maximum is greatest(m, m * n) for A LEFT JOIN B and greatest(n, m * n) for A RIGHT JOIN B.
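These bounds are easy to check empirically. The sketch below uses Python's built-in sqlite3 module as a stand-in database; the helper join_counts and the key values are made up for illustration. It counts the rows of an inner join and of A LEFT JOIN B for m = 4 and a few choices of B.

```python
import sqlite3

def join_counts(a_keys, b_keys):
    """Return (inner join row count, left join row count) for the given key lists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (k INTEGER)")
    conn.execute("CREATE TABLE b (k INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(k,) for k in a_keys])
    conn.executemany("INSERT INTO b VALUES (?)", [(k,) for k in b_keys])
    inner = conn.execute("SELECT COUNT(*) FROM a JOIN b ON a.k = b.k").fetchone()[0]
    left = conn.execute("SELECT COUNT(*) FROM a LEFT JOIN b ON a.k = b.k").fetchone()[0]
    conn.close()
    return inner, left

unique_keys = join_counts([1, 2, 3, 4], [1, 2, 3])  # unique keys, B contained in A
same_keys = join_counts([7, 7, 7, 7], [7, 7, 7])    # every row matches every row
disjoint = join_counts([1, 2, 3, 4], [5, 6, 7])     # no common keys
empty_b = join_counts([1, 2, 3, 4], [])             # B is empty

print(unique_keys, same_keys, disjoint, empty_b)
# (3, 4) (12, 12) (0, 4) (0, 4)
```

The inner join ranges from 0 to m * n, while the left join never drops below the m rows of A.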

Fetch data from multiple tables in postgresql

I am working on an application where I want to fetch the records from multiple tables which are connected through foreign key. The query I am using is
select ue.institute, ue.marks, uf.relation, uf.name
from user_education ue, user_family uf where ue.user_id=12 and uf.user_id=12
The result of the query contains repeated data. I want each record to appear only once, with no repetition. I want something like this:
T1               T2
id|name|fid      id|descrip|fid
1 |A   |1        1 |DA     |1
2 |B   |1        2 |DB     |1
2 |B   |1
Result which I want:
id|name|fid|id|descrip|fid
1 |A   |1  |1 |DA     |1
2 |B   |1  |2 |DB     |1
2 |B   |1  |  |       |
The results fetched through your query
The total rows are 5
More Information
I want the rows with the same user_id from both tables, but as you can see, T1 has 3 rows and T2 has 2 rows. I do not want repetitions, but I also want to fetch all the data for that user_id.
I can't see why you would want that, but the solution could be to use the window function row_number():
SELECT ue.institute, ue.marks, uf.relation, uf.name
FROM (SELECT institute, marks, row_number() OVER ()
FROM user_education
WHERE user_id=12) ue
FULL OUTER JOIN
(SELECT relation, name, row_number() OVER ()
FROM user_family
WHERE user_id=12) uf
USING (row_number);
The result would be pretty meaningless though, as there is no ordering defined in the individual result sets.
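If the pairing is purely for display, it may be simpler to fetch the two result sets separately and pair them positionally in client code. The sketch below shows what that pairing looks like in Python; the sample rows are invented, and sorting each list first is one assumed way of giving the pairing a defined order.

```python
from itertools import zip_longest

# Hypothetical rows for user_id = 12, as two independent result sets.
education = [("School B", 85), ("School A", 90), ("School C", 80)]
family = [("mother", "Ann"), ("father", "Sam")]

# Sort each list so the positional pairing is reproducible.
education.sort()
family.sort()

# Pair row i of one list with row i of the other; pad the shorter list with None.
paired = [
    (edu or (None, None)) + (fam or (None, None))
    for edu, fam in zip_longest(education, family)
]
for row in paired:
    print(row)
```

This produces the same shape as the FULL OUTER JOIN on row numbers: three rows, the last one with empty family columns.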

Aggregate multiple select statements without replicating data

How do I aggregate 2 select clauses without replicating data?
For instance, suppose I have tab_a that contains the data from 1 to 10:
|id|
|1 |
|2 |
|3 |
|. |
|. |
|10|
And then I want to generate the combinations of tab_b and tab_c, making sure that the result has 10 rows, and add the column of tab_a to each result tuple.
Script:
SELECT tab_b.id, tab_c.id, tab_a.id
from tab_b, tab_c, tab_a;
However, this replicates the data from tab_a for each combination of tab_b and tab_c. What I want is for each combination of tab_b x tab_c to be paired with a single row of tab_a.
Example of data from tab_b
|id|
|1 |
|2 |
Example of data from tab_c
|id|
|1 |
|2 |
|3 |
|4 |
|5 |
I would like to get this output:
|tab_b.id|tab_c.id|tab_a.id|
|1 |1 |1 |
|2 |1 |2 |
|1 |2 |3 |
|... |... |... |
|2 |5 |10 |
Your question includes an unstated, invalid assumption: that the position of the values in the table (the row number) is meaningful in SQL. It's not. In SQL, rows have no order. All joins -- everything, in fact -- are based on values. To join tables, you have to supply the values the DBMS should use to determine which rows go together.
You got a hint of that with your attempted join: from tab_b, tab_c, tab_a. You didn't supply any basis for joining the rows, which in SQL means there's no restriction: all rows are "the same" for the purpose of this join. They all match, and voila, you get them all!
To do what you want, redesign your tables with at least one more column: the key that serves to identify the value. It could be a number; for example, your source data might be an array. More commonly each value has a name of some kind.
Once you have tables with keys, I think you'll find the join easier to write and understand.
Perhaps you're new to SQL, but this is generally not the way things are done with RDBMSs. Anyway, if this is what you need, PostgreSQL can deal with it nicely, using different strategies:
Window Functions:
with
tab_a (id) as (select generate_series(1,10)),
tab_b (id) as (select generate_series(1,2)),
tab_c (id) as (select generate_series(1,5))
select tab_b_id, tab_c_id, tab_a.id
-- number the rows in an explicit order so that the pairing is deterministic
from (select *, row_number() over (order by id) from tab_a) as tab_a
left join (
select tab_b.id as tab_b_id, tab_c.id as tab_c_id,
row_number() over (order by tab_c.id, tab_b.id)
from tab_b, tab_c
) tabs_b_c ON (tabs_b_c.row_number = tab_a.row_number)
order by tab_a.id;
Arrays:
with
tab_a (id) as (select generate_series(1,10)),
tab_b (id) as (select generate_series(1,2)),
tab_c (id) as (select generate_series(1,5))
select bc[s][1], bc[s][2], a[s]
from (
select array(
select id
from tab_a
order by 1
) a,
array(
select array[tab_b.id, tab_c.id]
from tab_b, tab_c
order by tab_c.id, tab_b.id
) bc
) arr
join lateral generate_subscripts(arr.a, 1) s on true;
If I understand your question correctly, maybe this is what you are looking for:
SELECT bctable.b_id, bctable.c_id, atable.a_id
FROM (SELECT a_id, ROW_NUMBER () OVER () AS arnum FROM a) atable
JOIN (SELECT p.b_id, p.c_id, ROW_NUMBER () OVER () AS bcrnum
FROM ( SELECT b.b_id, c.c_id
FROM b CROSS JOIN c
ORDER BY c.c_id, b.b_id) p) bctable
ON atable.arnum = bctable.bcrnum
Please check the SQLFiddle .
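The row-number approach can be tried end to end without a server. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later). It puts an explicit ORDER BY inside each ROW_NUMBER() so that the pairing does not depend on physical row order; the aliases rn, b_id and c_id are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name, count in (("tab_a", 10), ("tab_b", 2), ("tab_c", 5)):
    conn.execute(f"CREATE TABLE {name} (id INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?)", [(i,) for i in range(1, count + 1)])

query = """
SELECT bc.b_id, bc.c_id, a.id
FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn FROM tab_a) AS a
JOIN (SELECT tab_b.id AS b_id, tab_c.id AS c_id,
             ROW_NUMBER() OVER (ORDER BY tab_c.id, tab_b.id) AS rn
      FROM tab_b CROSS JOIN tab_c) AS bc
  ON bc.rn = a.rn
ORDER BY a.id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The first rows are (1, 1, 1), (2, 1, 2), (1, 2, 3) and the last is (2, 5, 10), matching the output requested in the question.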

Merge inserted values against temp table leaving out null values

In my process I need to create an INSTEAD OF INSERT trigger which will accept the new values for the record and create a new version of it, example:
Note: Table columns are constantly changing due to business requirements. So if there is a solution that would support table to table merge instead of column to column that would be awesome.
ExampleTable
VersionID |ID |Value 1 |Value 2
1 |1 |abc | 123
Example query
INSERT INTO ExampleTable (ID,[Value 1]) VALUES (1,'testabc')
Resulting table:
VersionID |ID |Value 1 |Value 2
1 |1 |abc | 123
2 |1 |testabc | 123
At this moment I have something like this:
-- Get data
SELECT TOP 1 * INTO #ExistingData FROM dbo.ExampleTableLatestVersionView
WHERE ID = @ID
-- Merge incoming data
MERGE #ExistingData AS target
USING inserted as source
ON (target.ID= source.ID)
WHEN MATCHED
THEN UPDATE SET target.[Value 1] = source.[Value 1],
target.[Value 2] = source.[Value 2];
-- And afterwards I do a new insert into version table
The problem here is that NULL values from the inserted table overwrite the existing values, and I end up with this:
VersionID |ID |Value 1 |Value 2
1 |1 |abc | 123
2 |1 |testabc | NULL
I was thinking of doing INSTEAD OF UPDATE where I could get previous values by referencing VersionID, but I want to know if this is possible.
This will use the existing value if the provided value is NULL:
MERGE #ExistingData AS target
USING inserted as source
ON (target.ID= source.ID)
WHEN MATCHED
THEN UPDATE SET target.[Value 1] = ISNULL( source.[Value 1],target.[Value 1]),
target.[Value 2] = ISNULL( source.[Value 2],target.[Value 2]);
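Because the question mentions that the columns change often, the SET list could also be generated from the table definition rather than written by hand. The sketch below shows the idea with Python's built-in sqlite3 module as a stand-in. SQLite has no MERGE, so it uses UPDATE with COALESCE (the standard counterpart of a two-argument ISNULL), and the tables existing and incoming are hypothetical stand-ins for #ExistingData and inserted. In SQL Server the column list would come from the catalog views and the statement would run as dynamic SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE existing (id INTEGER PRIMARY KEY, value1 TEXT, value2 TEXT)")
conn.execute("CREATE TABLE incoming (id INTEGER PRIMARY KEY, value1 TEXT, value2 TEXT)")
conn.execute("INSERT INTO existing VALUES (1, 'abc', '123')")
conn.execute("INSERT INTO incoming VALUES (1, 'testabc', NULL)")

# Discover the non-key columns at run time so the statement follows schema changes.
columns = [row[1] for row in conn.execute("PRAGMA table_info(existing)") if row[1] != "id"]

# For each column, take the incoming value unless it is NULL.
set_clause = ", ".join(
    f'"{c}" = COALESCE((SELECT i."{c}" FROM incoming AS i WHERE i.id = existing.id), existing."{c}")'
    for c in columns
)
conn.execute(f"UPDATE existing SET {set_clause} WHERE id IN (SELECT id FROM incoming)")

merged = conn.execute("SELECT id, value1, value2 FROM existing").fetchall()
print(merged)  # [(1, 'testabc', '123')]
```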

Crosstab SQL query for 2 tables? [closed]

I have 2 tables and I need to make one view of them, as if they were a single table.
Table1 DEVICE
+-----+-------+-----------+
|DevID|DevName|DevIP |
+-----+-------+-----------+
|1 |HH1 |192.168.1.1|
+-----+-------+-----------+
|2 |HH2 |192.168.1.2|
+-----+-------+-----------+
Table2 DEVICECUSTOMDATA
+-----+------------+--------+
|DevID|Name |Value |
+-----+------------+--------+
|1 |Model |CN70 |
+-----+------------+--------+
|1 |BuildVersion|1.2 |
+-----+------------+--------+
|1 |BuildDate |20140113|
+-----+------------+--------+
|2 |Model |MC55 |
+-----+------------+--------+
|2 |BuildVersion|1.2 |
+-----+------------+--------+
|2 |BuildDate |20140110|
+-----+------------+--------+
The resulting table should be:
+-----+-------+-----------+-----+------------+---------+
|DevID|DevName|DevIP |Model|BuildVersion|BuildDate|
+-----+-------+-----------+-----+------------+---------+
|1 |HH1 |192.168.1.1|CN70 |1.2 |20140113 |
+-----+-------+-----------+-----+------------+---------+
|2 |HH2 |192.168.1.2|MC55 |1.2 |20140110 |
+-----+-------+-----------+-----+------------+---------+
I would appreciate any help to do this. Thanks
SQL SERVER:
See SqlFiddle:
SELECT d.DevId, d.DevName, d.DevIp, p.Model, p.BuildVersion, p.BuildDate
FROM DEVICE d
JOIN (
SELECT *
FROM DEVICECUSTOMDATA
PIVOT (MAX(Value) FOR Name IN ([Model], [BuildVersion], [BuildDate])) as Something) p
on d.DevId = p.DevId
Working Online Example (SQL Server Syntax): SQL Fiddle
SQL Script:
-- Table variables are declared with @ (a # prefix denotes a temp table and cannot be used with DECLARE)
DECLARE @Device TABLE (
DevID int not null,
DevName varchar(max) not null,
DevIP varchar(max) not null
)
insert into @Device values ('1', 'HH1','192.168.1.1')
insert into @Device values ('2', 'HH2','192.168.1.2')
DECLARE @DeviceCustomData TABLE (
CDevID int not null,
Name varchar(max) not null,
Value varchar(max) not null
)
insert into @DeviceCustomData
values ('1','Model','CN70')
insert into @DeviceCustomData
values ('1','BuildVersion','1.2')
insert into @DeviceCustomData
values ('1','BuildDate','20140113')
insert into @DeviceCustomData
values ('2','Model','MC55')
insert into @DeviceCustomData
values ('2','BuildVersion','1.2')
insert into @DeviceCustomData
values ('2','BuildDate','20140110')
-- Join first, then turn each Name into its own column
SELECT *
FROM
(SELECT d.DevID, d.DevName, d.DevIP, c.Value, c.Name
FROM @Device d
inner join @DeviceCustomData c on d.DevID = c.CDevID) AS SourceTable
PIVOT(
MIN(Value)
FOR Name in ([Model],[BuildVersion],[BuildDate])
) as PivotTable
Reference: http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
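PIVOT is specific to SQL Server. On engines without it, the same crosstab can usually be written with conditional aggregation, which is a different technique from the answers above: one MAX(CASE ...) per attribute. The sketch below runs it with Python's built-in sqlite3 module as a stand-in database and the sample data from the question. The attribute names are interpolated into the SQL text, so they are assumed to come from a trusted list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE device (DevID INTEGER, DevName TEXT, DevIP TEXT);
CREATE TABLE devicecustomdata (DevID INTEGER, Name TEXT, Value TEXT);
INSERT INTO device VALUES (1, 'HH1', '192.168.1.1'), (2, 'HH2', '192.168.1.2');
INSERT INTO devicecustomdata VALUES
    (1, 'Model', 'CN70'), (1, 'BuildVersion', '1.2'), (1, 'BuildDate', '20140113'),
    (2, 'Model', 'MC55'), (2, 'BuildVersion', '1.2'), (2, 'BuildDate', '20140110');
""")

# One output column per attribute; MAX picks the single non-NULL value per device.
attributes = ["Model", "BuildVersion", "BuildDate"]
pivot_columns = ", ".join(
    f"MAX(CASE WHEN c.Name = '{a}' THEN c.Value END) AS {a}" for a in attributes
)
query = f"""
SELECT d.DevID, d.DevName, d.DevIP, {pivot_columns}
FROM device AS d
LEFT JOIN devicecustomdata AS c ON c.DevID = d.DevID
GROUP BY d.DevID, d.DevName, d.DevIP
ORDER BY d.DevID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN keeps devices that have no custom data at all; their attribute columns come back as NULL.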