SQL query for determining unique values - sql

I am trying to display corresponding dates for serial number, sometimes the query doesn't display values as the Partobj column doesn't have unique values.
How could I get serial number which have unique Partobj?
select
ib.Date1,w.Date2,w.Date3
from
table1 w
left outer join
table2 ib on w.Partobj=ib.Partobj
where
ib.SerialNumber = '12we'

Without seeing any sample data you could do something similar to this, where you join on table2 looking for only the DISTINCT records:
select ib.Date1,w.Date2,w.Date3
from table1 w
left outer join
(
SELECT DISTINCT Date1, Partobj, SerialNumber
FROM table2
) ib
on w.Partobj=ib.Partobj
where ib.SerialNumber = '12we'
If you post some sample data from both tables, it would be helpful.

Related

Multiple rows in table with values from another table

I am struggling with following issue:
Table1:
Table2:
Expected result:
Basically I want to multiple rows in dates table with rows from User table. Is it somehow possible? (using TSQL).
You need to apply cross join
select date,month,userid,userno from table1 cross join table2
select Date, Month, USER_ID, ID from
t1 cross join t2

Count rows after joining three tables in PostgreSQL

Suppose I have three tables in PostgreSQL:
table1 - id1, a_id, updated_by_id
table2 - id2, a_id, updated_by_id
Users - id, display_name
Suppose I am using the using the following query:
select count(t1.id1) from table1 t1
left join table2 t2 on (t1.a_id=t2.a_id)
full outer join users u1 t1.updated_by_id=u1.id)
full outer join users u2 t2.updated_by_id=u2.id)
where u1.id=100;
I get 50 as count.
Whereas with:
select count(t1.id1) from table1 t1
left join table2 t2 on (t1.a_id=t2.a_id)
full outer join users u1 t1.updated_by_id=u1.id)
full outer join users u2 t2.updated_by_id=u2.id)
where u2.id=100;
I get only 25 as count.
What is my mistake in the second query? What can I do to get the same count?
My requirement is that there is a single user table, referenced by multiple tables. I want to take the complete list of users and get the count of ids from different tables.
But the table on which I have joined alone returns the proper count but rest of them don't return the proper count. Can anybody suggest a way to modify my second query to get the proper count?
To simplify your logic, aggregate first, join later.
Guessing missing details, this query would give you the exact count, how many times each user was referenced in table1 and table2 respectively for all users:
SELECT *
FROM users u
LEFT JOIN (
SELECT updated_by_id AS id, count(*) AS t1_ct
FROM table1
GROUP BY 1
) t1 USING (id)
LEFT JOIN (
SELECT updated_by_id AS id, count(*) AS t2_ct
FROM table2
GROUP BY 1
) t2 USING (id);
In particular, avoid multiple 1-n relationships multiplying each other when joined together:
Two SQL LEFT JOINS produce incorrect result
To retrieve a single or few users only, LATERAL joins will be faster (Postgres 9.3+):
SELECT *
FROM users u
LEFT JOIN LATERAL (
SELECT count(*) AS t1_ct
FROM table1
WHERE updated_by_id = u.id
) ON true
LEFT JOIN LATERAL (
SELECT count(*) AS t2_ct
FROM table2
WHERE updated_by_id = u.id
) ON true
WHERE u.id = 100;
What is the difference between LATERAL JOIN and a subquery in PostgreSQL?
Explain perceived difference
The particular mismatch you report is due to the specifics of a FULL OUTER JOIN:
First, an inner join is performed. Then, for each row in T1 that does
not satisfy the join condition with any row in T2, a joined row is
added with null values in columns of T2. Also, for each row of T2 that
does not satisfy the join condition with any row in T1, a joined row
with null values in the columns of T1 is added.
So you get NULL values appended on the respective other side for missing matches. count() does not count NULL values. So you can get a different result depending on whether you filter on u1.id=100 or u2.id=100.
This is just to explain, you don't need a FULL JOIN here. Use the presented alternatives instead.

Join SQL Tables with Unique Data (Not same number of columns!)

How can I join three or four SQL tables that DO NOT have an equal amount of rows while ensuring that there are no duplicates of a primary/foreign key?
Structure:
Table1: id, first_name, last_name, email
Table2: id (independent of id in table 1), name, location, table1_id, table2_id
Table3: id, name, location
I want all of the data from table 1, then all of the data from table 2 corresponding with the table1_id without duplicates.
Kind of tricky for a new guy...
Not sure what do you want to do with Table3.
A LEFT JOIN returns all records from the LEFT table, and the matched records from the right table. If there is no match (from the right side), then the result is NULL.
So per example:
SELECT * FROM Table1 AS t
LEFT JOIN Table2 AS tt
ON t.id = tt.id
The LEFT table refers to the table statement before the LEFT JOIN, and the RIGHT table refers to the table statement after the LEFT JOIN. If you want to add in Table3 as well, use the same logic:
SELECT * FROM Table1 AS t
LEFT JOIN Table2 AS tt
ON t.id = tt.id
LEFT JOIN Table3 AS ttt
ON t.id = ttt.id
Note, that I use alias names for the tables (by using AS), so that I can more easily refer to a specific table. For example, t refers to Table1, tt refers to Table2, and ttt refers to Table3.
Joins are often used in SQL, therefore it is useful to look into: INNER JOIN, RIGHT JOIN, FULL JOIN, and SELF JOIN, as well.
Hope this helps.
Good luck with learning!
You will want to use an LEFT JOIN
SELECT * FROM table1 LEFT JOIN table2 ON Table1.ID = Table2.table1_id

How to get key of the maximal record in group?

I have two tables in my database, one holds the names of files, and other holds records of information described in them, inincluding sizes of sections. it can be descrived as:
Table1: id as integer, name as varchar
Table2: recid as integer primary key, file_id as integer, score as float
Between the tables there is an one-to-many link, from Table1.id to table2.file_id. What i need is for every file which name matches a certain pattern retrieve the id of the linked record with the maximum score and the score itself.
So far i have used:
SELECT name,MAX(score)
FROM Table1
LEFT OUTER JOIN Table2 ON Table2.file_id=Table1.id
WHERE name LIKE :pattern
GROUP BY name
but i cannot retrieve the id of the record in Table2 this way.
The dialect i am using is Sqlite.
What query should be used to retrieve data on the record that has maximum score for every file?
Update:
With this query, i am getting close to what i want:
SELECT name,score,recid
FROM Table1
LEFT OUTER JOIN Table2 ON file_id=id
WHERE name LIKE :pattern
GROUP BY name
HAVING score=MAX(score)
However, this leaves out the entries in the first table that have no corresponding entries in the second table out. How can i ensure they are in the end result anyway? Should i use UNION, and if so - how?
This can actually be achieved without a GROUP BY by using a brilliantly simple technique described by #billkarwin here:
SELECT name, t2.score
FROM Table1 t1
LEFT OUTER JOIN Table2 t2 ON t2.file_id = t1.id
LEFT OUTER JOIN Table2 t2copy ON t2copy.file_id = t2.file_id
AND t2.score < t2copy.score
WHERE name LIKE :pattern
AND t2copy.score IS NULL
See SQL Fiddle demo.
I think that you must use a subquery
SELECT name, recid, score
FROM Table1
LEFT OUTER JOIN Table2 ON Table2.file_id=Table1.id
WHERE name LIKE :pattern AND score = (SELECT MAX(score) FROM Table2.score)
I think the easiest way to do this is with a correlated subquery:
SELECT name, recid, score
FROM Table1 LEFT OUTER JOIN
Table2
ON Table2.file_id=Table1.id
WHERE name LIKE :pattern AND
score = (SELECT MAX(t2.score)
FROM Table1 t1 LEFT OUTER JOIN
Table2 t2
ON t2.file_id=t1.id
where t1.name = table1.name
);
Note that you need table aliases to distinguish the tables in the inner query from the outer query. I am guessing which tables the columns are actually coming from.

An issue possibly related to Cursor/Join

Here is my situation:
Table one contains a set of data that uses an id for an unique identifier. This table has a one to many relationship with about 6 other tables such that.
Given Table 1 with Id of 001:
Table 2 might have 3 rows with foreign key: 001
Table 3 might have 12 rows with foreign key: 001
Table 4 might have 0 rows with foreign key: 001
Table 5 might have 28 rows with foreign key: 001
I need to write a report that lists all of the rows from Table 1 for a specified time frame followed by all of the data contained in the handful of tables that reference it.
My current approach in pseudo code would look like this:
select * from table 1
foreach(result) {
print result;
select * from table 2 where id = result.id;
foreach(result2) {
print result2;
}
select * from table 3 where id = result.id
foreach(result3) {
print result3;
}
//continued for each table
}
This means that the single report can run in the neighbor hood of 1000 queries. I know this is excessive however my sql-fu is a little weak and I could use some help.
LEFT OUTER JOIN Tables2-N on Table1
SELECT Table1.*, Table2.*, Table3.*, Table4.*, Table5.*
FROM Table1
LEFT OUTER JOIN Table2 ON Table1.ID = Table2.ID
LEFT OUTER JOIN Table3 ON Table1.ID = Table3.ID
LEFT OUTER JOIN Table4 ON Table1.ID = Table4.ID
LEFT OUTER JOIN Table5 ON Table1.ID = Table5.ID
WHERE (CRITERIA)
Join doesn't do it for me. I hate having to de-tangle the data on the client side. All those nulls from left-joining.
Here's a set-based solution that doesn't use Joins.
INSERT INTO #LocalCollection (theKey)
SELECT id
FROM Table1
WHERE ...
SELECT * FROM Table1 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table2 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table3 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table4 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table5 WHERE id in (SELECT theKey FROM #LocalCollection)
Ah! Procedural! My SQL would look like this, if you needed to order the results from the other tables after the results from the first table.
Insert Into #rows Select id from Table1 where date between '12/30' and '12/31'
Select * from Table1 t join #rows r on t.id = r.id
Select * from Table2 t join #rows r on t.id = r.id
--etc
If you wanted to group the results by the initial ID, use a Left Outer Join, as mentioned previously.
You may be best off to use a reporting tool like Crystal or Jasper, or even XSL-FO if you are feeling bold. They have things built in to handle specifically this. This is not something the would work well in raw SQL.
If the format of all of the rows (the headers as well as all of the details) is the same, it would also be pretty easy to do it as a stored procedure.
What I would do: Do it as a join, so you will have the header data on every row, then use a reporting tool to do the grouping.
SELECT * FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.resultid -- this could be a left join if the table is not guaranteed to have entries for t1.id
INNER JOIN table2 t3 ON t1.id = t3.resultid -- etc
OR if the data is all in the same format you could do.
SELECT cola,colb FROM table1 WHERE id = #id
UNION ALL
SELECT cola,colb FROM table2 WHERE resultid = #id
UNION ALL
SELECT cola,colb FROM table3 WHERE resultid = #id
It really depends on the format you require the data in for output to the report.
If you can give a sample of how you would like the output I could probably help more.
Join all of the tables together.
select * from table_1 left join table_2 using(id) left join table_3 using(id);
Then, you'll want to roll up the columns in code to format your report as you see fit.
What I would do is open up cursors on the following queries:
SELECT * from table1 order by id
SELECT * from table1 r, table2 t where t.table1_id = r.id order by r.id
SELECT * from table1 r, table3 t where t.table1_id = r.id order by r.id
And then I would walk those cursors in parallel, printing your results. You can do this because all appear in the same order. (Note that I would suggest that while the primary ID for table1 might be named id, it won't have that name in the other tables.)
Do all the tables have the same format? If not, then if you have to have a report that can display the n different types of rows. If you are only interested in the same columns then it is easier.
Most databases have some form of dynamic SQL. In that case you can do the following:
create temporary table from
select * from table1 where rows within time frame
x integer
sql varchar(something)
x = 1
while x <= numresults {
sql = 'SELECT * from table' + CAST(X as varchar) + ' where id in (select id from temporary table'
execute sql
x = x + 1
}
But I mean basically here you are running one query on your main table to get the rows that you need, then running one query for each sub table to get rows that match your main table.
If the report requires the same 2 or 3 columns for each table you could change the select * from tablex to be an insert into and get a single result set at the end...