Oracle SQL: inner join to the first record in the right table

My question is this:
I have two tables. The first looks like this:
username | portname | symbol | shares
---------+----------+--------+-------
phil | test | APL | 214
---------+----------+--------+--------
It has more records, but that's just an example. Then I have another table such as this, that has multiple records per symbol
symbol | high | low | timestamp
-------+------+-----+-----------
APL | 200 | 20 | *timestamp object
APL | 400 | 34 | *timestamp object
I want a table to be returned where I join the two, but only the first row from the second table is joined so something like this is returned:
symbol | high | low | timestamp
-------+------+-----+----------
APL | 400 | 34 | *timestamp object
So only one record from the right table is matched. I've tried a lot of things but haven't gotten anything to work with GROUP BY or DISTINCT.
Thanks!

SELECT t1.symbol, t3.high, t3.low, t3.timestamp
FROM Table1 t1
JOIN (
    SELECT inn.*
    FROM (SELECT t2.*,
                 ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY timestamp DESC) AS Rank
          FROM Table2 t2) inn
    WHERE inn.Rank = 1
) t3
  ON t1.symbol = t3.symbol;
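To try the ROW_NUMBER() approach without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer (for window functions). The table names portfolio and quotes, the column name ts, and the timestamp values are made up for illustration; the other values come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE portfolio (username TEXT, portname TEXT, symbol TEXT, shares INTEGER);
    CREATE TABLE quotes (symbol TEXT, high INTEGER, low INTEGER, ts TEXT);
    INSERT INTO portfolio VALUES ('phil', 'test', 'APL', 214);
    -- Timestamps are hypothetical; the question only says "timestamp object".
    INSERT INTO quotes VALUES ('APL', 200, 20, '2013-01-01 09:00:00');
    INSERT INTO quotes VALUES ('APL', 400, 34, '2013-01-02 09:00:00');
""")

# Number the quotes per symbol, newest first, and join only row number 1.
latest_quotes = conn.execute("""
    SELECT p.symbol, q.high, q.low, q.ts
    FROM portfolio p
    JOIN (
        SELECT symbol, high, low, ts,
               ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY ts DESC) AS rn
        FROM quotes
    ) q ON q.symbol = p.symbol
    WHERE q.rn = 1
""").fetchall()

print(latest_quotes)
```

Only the newest quote per symbol survives the rn = 1 filter, so each portfolio row joins to at most one quote.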

Related

Comparing aggregated columns to non aggregated columns to remove matches

I have two tables from two different databases that I am running a matching check against.
If the values match, I want them out of the result set. The first table (A) has multiple rows with the same symbol, and each symbol also appears in the second table (B).
The rows in table B, when added up per symbol, should ideally equal the value of one of the matching rows in A.
The tables look like this when queried separately.
Underneath the tables is my current query. I thought that if I grouped by symbol, I could compare the SUM of B's values against A's value and drop the rows that match. However, because I am summing B and not A, A's value is not an aggregated column, so it has to be included in the GROUP BY, and then the sum is not calculated the way I want.
How can I write this query so that the values in B are summed per symbol and, where that sum matches the value of any row in A with the same symbol, those rows are left out of the result set?
Table A
| Symbol | Value |
|--------|-------|
| A | 1000 |
| A | 1000 |
| B | 1440 |
| B | 1440 |
| C | 1235 |
Table B
| Symbol | Value |
|--------|-------|
| A | 750 |
| A | 250 |
| B | 24 |
| B | 1416 |
| C | 1874 |
SELECT DBA.A, DBB.B
FROM DatabaseA DBA
INNER JOIN DatabaseB DBB on DBA.Symbol = DBB.Symbol
and DBA.Value != DBB.Value
group by DBA.Symbol, DBB.Symbol, DBB.Value
having SUM(DBB.Value) != DBA.Value
order by Symbol, Value
Edited to add ideal results
Table C
| SymbolB| ValueB| SymbolA | ValueA |
|--------|-------|---------|--------|
| C | 1874 | C | 1235 |
Wherever B's values add up to A's value, remove both. If they don't add up, leave the rows in the result set.
I would compute the per-symbol totals of table B in a common table expression (CTE), then join table A and table B on symbol and keep only the rows of A whose value does not match a total.
WITH tDBB as (
SELECT DBB.Symbol, SUM(DBB.Value) as total
FROM tableB as DBB
GROUP BY DBB.Symbol
)
SELECT distinct DBB.Symbol as SymbolB, DBB.Value as ValueB, DBA.Symbol as SymbolA, DBA.Value as ValueA
FROM tableA as DBA
INNER JOIN tableB as DBB on DBA.Symbol = DBB.Symbol
WHERE DBA.Symbol in (Select Symbol from tDBB)
AND NOT DBA.Value in (Select total from tDBB)
Result:
|symbolB |valueB |SymbolA |ValueA |
|--------|-------|--------|-------|
| C | 1874 | C | 1235 |
with t3 as (
select symbol
,sum(value) as value
from t2
group by symbol
)
select *
from t3 join t on t.symbol = t3.symbol and t.value != t3.value
| symbol | value | Symbol | Value |
|--------|-------|--------|-------|
| C | 1874 | C | 1235 |
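The "sum B per symbol, then compare against A" idea can be checked against the sample data with a short sketch. This uses Python's sqlite3 module rather than the asker's databases, and the names table_a, table_b and b_totals are placeholders. DISTINCT is added here only to collapse duplicate rows from A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (symbol TEXT, value INTEGER);
    CREATE TABLE table_b (symbol TEXT, value INTEGER);
    INSERT INTO table_a VALUES ('A', 1000), ('A', 1000), ('B', 1440), ('B', 1440), ('C', 1235);
    INSERT INTO table_b VALUES ('A', 750), ('A', 250), ('B', 24), ('B', 1416), ('C', 1874);
""")

# Sum B per symbol first, then keep only the symbols whose total
# differs from the value recorded in A for the same symbol.
mismatches = conn.execute("""
    WITH b_totals AS (
        SELECT symbol, SUM(value) AS total
        FROM table_b
        GROUP BY symbol
    )
    SELECT DISTINCT b.symbol, b.total, a.symbol, a.value
    FROM b_totals b
    JOIN table_a a ON a.symbol = b.symbol AND a.value <> b.total
    ORDER BY b.symbol
""").fetchall()

print(mismatches)
```

Symbols A and B drop out because their B totals (1000 and 1440) equal the values in A; only C remains.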

SQL script runs VERY slowly with small change

I am relatively new to SQL. I have a script that used to run very quickly (<0.5 seconds) but runs very slowly (>120 seconds) if I add one change - and I can't see why this change makes such a difference. Any help would be hugely appreciated!
This is the script, and it runs quickly if I do NOT include "tt2.bulk_cnt" in line 26:
with bulksum1 as
(
select t1.membercode,
t1.schemecode,
t1.transdate
from mina_raw2 t1
where t1.transactiontype in ('RSP','SP','UNTV','ASTR','CN','TVIN','UCON','TRAS')
group by t1.membercode,
t1.schemecode,
t1.transdate
),
bulksum2 as
(
select t1.schemecode,
t1.transdate,
count(*) as bulk_cnt
from bulksum1 t1
group by t1.schemecode,
t1.transdate
having count(*) >= 10
),
results as
(
select t1.*, tt2.bulk_cnt
from mina_raw2 t1
inner join bulksum2 tt2
on t1.schemecode = tt2.schemecode and t1.transdate = tt2.transdate
where t1.transactiontype in ('RSP','SP','UNTV','ASTR','CN','TVIN','UCON','TRAS')
)
select * from results
EDIT: I apologise for not putting enough detail in here previously - although I can use basic SQL code, I am a complete novice when it comes to databases.
Database: Oracle (I'm not sure which version, sorry)
Execution plans:
QUICK query:
Plan hash value: 1712123489
---------------------------------------------
| Id | Operation | Name |
---------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | HASH JOIN | |
| 2 | VIEW | |
| 3 | FILTER | |
| 4 | HASH GROUP BY | |
| 5 | VIEW | VM_NWVW_0 |
| 6 | HASH GROUP BY | |
| 7 | TABLE ACCESS FULL| MINA_RAW2 |
| 8 | TABLE ACCESS FULL | MINA_RAW2 |
---------------------------------------------
SLOW query:
Plan hash value: 1298175315
--------------------------------------------
| Id | Operation | Name |
--------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | FILTER | |
| 2 | HASH GROUP BY | |
| 3 | HASH JOIN | |
| 4 | VIEW | VM_NWVW_0 |
| 5 | HASH GROUP BY | |
| 6 | TABLE ACCESS FULL| MINA_RAW2 |
| 7 | TABLE ACCESS FULL | MINA_RAW2 |
--------------------------------------------
A few observations, and then some things to do:
1) More information is needed. In particular, how many rows are there in the MINA_RAW2 table, what indexes exist on this table, and when was the last time it was analyzed? To determine the answers to these questions, run:
SELECT COUNT(*) FROM MINA_RAW2;
SELECT TABLE_NAME, LAST_ANALYZED, NUM_ROWS
FROM USER_TABLES
WHERE TABLE_NAME = 'MINA_RAW2';
From looking at the plan output it looks like the database is doing two FULL SCANs on MINA_RAW2 - it would be nice if this could be reduced to no more than one, and hopefully none. It's always tough to tell without very detailed information about the data in the table, but at first blush it appears that an index on TRANSACTIONTYPE might be helpful. If such an index doesn't exist you might want to consider adding it.
2) Assuming that the statistics are out-of-date (as in, old, nonexistent, or a significant amount of data (> 10%) has been added, deleted, or updated since the last analysis) run the following:
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'YOUR-SCHEMA-NAME',
tabname => 'MINA_RAW2');
END;
substituting the correct schema name for "YOUR-SCHEMA-NAME" above. Remember to capitalize the schema name! If you don't know if you should or shouldn't gather statistics, err on the side of caution and do it. It shouldn't take much time.
3) Re-try your existing query after updating the table statistics. I think there's a fair chance that having up-to-date statistics in the database will solve your issues. If not:
4) This query is doing a GROUP BY on the results of a GROUP BY. This doesn't appear to be necessary, as the initial GROUP BY doesn't compute any aggregates. Instead, it appears to be there to get the unique combinations of MEMBERCODE, SCHEMECODE, and TRANSDATE so that the number of members by scheme and date can be counted. I think the whole query can be simplified to:
WITH cteWORKING_TRANS AS (SELECT *
                            FROM MINA_RAW2
                            WHERE TRANSACTIONTYPE IN ('RSP','SP','UNTV',
                                                      'ASTR','CN','TVIN',
                                                      'UCON','TRAS')),
     cteBULKSUM AS (SELECT a.SCHEMECODE,
                           a.TRANSDATE,
                           COUNT(*) AS BULK_CNT
                      FROM (SELECT DISTINCT MEMBERCODE,
                                            SCHEMECODE,
                                            TRANSDATE
                              FROM cteWORKING_TRANS) a
                      GROUP BY a.SCHEMECODE,
                               a.TRANSDATE
                      -- keep the original query's threshold of 10 members
                      HAVING COUNT(*) >= 10)
SELECT t.*, b.BULK_CNT
  FROM cteWORKING_TRANS t
  INNER JOIN cteBULKSUM b
    ON b.SCHEMECODE = t.SCHEMECODE AND
       b.TRANSDATE = t.TRANSDATE
I managed to remove an unnecessary subquery by putting DISTINCT inside COUNT. I've certainly used this in PostgreSQL; COUNT(DISTINCT ...) is standard SQL, so it should work in Oracle as well, but check that it gives the result you want.
select t1.*, tt2.bulk_cnt
from mina_raw2 t1
inner join (select t2.schemecode,
t2.transdate,
count(DISTINCT membercode) as bulk_cnt
from mina_raw2 t2
where t2.transactiontype in ('RSP','SP','UNTV','ASTR','CN','TVIN','UCON','TRAS')
group by t2.schemecode,
t2.transdate
having count(DISTINCT membercode) >= 10) tt2
on t1.schemecode = tt2.schemecode and t1.transdate = tt2.transdate
where t1.transactiontype in ('RSP','SP','UNTV','ASTR','CN','TVIN','UCON','TRAS')
When you use WITH queries (CTEs) where a plain subquery would do, you can end up kneecapping the query optimizer.
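Whether COUNT(DISTINCT membercode) really matches the nested GROUP BY can be checked on a small data set. The sketch below uses Python's sqlite3 module, a handful of made-up rows, a shortened transaction-type list, and a threshold of 2 instead of 10, purely so the example stays small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mina_raw2 (membercode TEXT, schemecode TEXT,
                            transdate TEXT, transactiontype TEXT);
    -- Hypothetical rows: m1 appears twice in scheme s1, m3 has an excluded type.
    INSERT INTO mina_raw2 VALUES
        ('m1', 's1', '2020-01-01', 'SP'),
        ('m1', 's1', '2020-01-01', 'RSP'),
        ('m2', 's1', '2020-01-01', 'SP'),
        ('m3', 's1', '2020-01-01', 'XX'),
        ('m1', 's2', '2020-01-01', 'SP');
""")

# Original shape: de-duplicate members with one GROUP BY, then count them.
nested = conn.execute("""
    SELECT schemecode, transdate, COUNT(*) AS bulk_cnt
    FROM (SELECT membercode, schemecode, transdate
          FROM mina_raw2
          WHERE transactiontype IN ('RSP', 'SP')
          GROUP BY membercode, schemecode, transdate)
    GROUP BY schemecode, transdate
    HAVING COUNT(*) >= 2
    ORDER BY schemecode, transdate
""").fetchall()

# Single-level shape: count distinct members directly.
distinct_count = conn.execute("""
    SELECT schemecode, transdate, COUNT(DISTINCT membercode) AS bulk_cnt
    FROM mina_raw2
    WHERE transactiontype IN ('RSP', 'SP')
    GROUP BY schemecode, transdate
    HAVING COUNT(DISTINCT membercode) >= 2
    ORDER BY schemecode, transdate
""").fetchall()

print(nested, distinct_count)
```

One difference to keep in mind: COUNT(DISTINCT membercode) ignores NULL member codes, while the nested GROUP BY counts NULL as its own group, so the two forms only agree when membercode is never NULL.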

How to search max value from group in sql

I am just learning some SQL, so I have a question.
-I have a table with name TABL
-a variable :ccname which has a value "Bottle"
The table is as follows:
+----------+---------+-------+--------+
| Name | Price | QTY | CODE |
+----------+---------+-------+--------+
| Rope | 3.6 | 35 | 236 |
| Chain | 2.8 | 15 | 237 |
| Paper | 1.6 | 45 | 124 |
| Bottle | 4.5 | 41 | 478 |
| Bottle | 1.8 | 12 | 123 |
| Computer | 1450.75 | 71 | 784 |
| Spoon | 0.7 | 10 | 412 |
| Bottle | 1.3 | 15 | 781 |
| Rope | 0.9 | 14 | 965 |
+----------+---------+-------+--------+
Now I want to find the CODE of the row whose Name matches the variable :ccname and that has the highest quantity! So I translated it like this:
SELECT CODE
FROM TABL
GROUP BY :ccname
WHERE QTY=MAX(QTY)
In a perfect world that would return 478 as a result.
In the SQL world what should I write in order to get 478?
You probably want something like this:
SELECT code
FROM TABL
WHERE Name=:ccname
ORDER BY QTY DESC
LIMIT 1
The idea is that we find all rows of the table whose Name column matches the contents of the variable :ccname, order them by quantity in descending order, and finally select the first one, which has to be the one with the largest quantity because they are sorted in descending order.
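As a quick check of the filter, sort, take-first idea, here is a small sketch that loads the question's table into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here; it happens to support both LIMIT and :ccname style named parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tabl (name TEXT, price REAL, qty INTEGER, code INTEGER)")
conn.executemany(
    "INSERT INTO tabl VALUES (?, ?, ?, ?)",
    [
        ("Rope", 3.6, 35, 236),
        ("Chain", 2.8, 15, 237),
        ("Paper", 1.6, 45, 124),
        ("Bottle", 4.5, 41, 478),
        ("Bottle", 1.8, 12, 123),
        ("Computer", 1450.75, 71, 784),
        ("Spoon", 0.7, 10, 412),
        ("Bottle", 1.3, 15, 781),
        ("Rope", 0.9, 14, 965),
    ],
)

# Filter to the requested name, sort by quantity descending, keep the first row.
best = conn.execute(
    "SELECT code FROM tabl WHERE name = :ccname ORDER BY qty DESC LIMIT 1",
    {"ccname": "Bottle"},
).fetchone()

print(best[0])
```

If two rows tie for the highest quantity, this returns only one of them, and which one is not defined unless a tie-breaking column is added to the ORDER BY.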
Try this:
SELECT CODE
FROM TABL
WHERE Name = :ccname
AND QTY = (SELECT MAX(QTY) FROM TABL WHERE Name = :ccname)
Use ORDER BY, a proper WHERE, and something to limit the result set to one row:
SELECT CODE
FROM TABL
WHERE name = :ccname
ORDER BY QTY DESC
FETCH FIRST 1 ROW ONLY;
Note: Some databases spell the ANSI standard FETCH FIRST 1 ROW ONLY as LIMIT or as SELECT TOP 1.
Depending on your specific database, you can use one of the following options to restrict your result set to a single value after ordering your existing columns through an ORDER BY clause:
SELECT TOP 1
LIMIT 1
FETCH FIRST 1 ROW ONLY
Syntax Examples
SELECT TOP 1 Code
FROM TABL
WHERE Name = :ccname
ORDER BY QTY DESC
or
SELECT Code
FROM TABL
WHERE Name = :ccname
ORDER BY QTY DESC
LIMIT 1
or
SELECT CODE
FROM TABL
WHERE Name = :ccname
ORDER BY QTY DESC
FETCH FIRST 1 ROW ONLY;
A join can also solve this:
Select t1.Code
From TABL t1 Join (
Select Name, Max(QTY) as MaxQTY
From TABL
Where Name = :ccname
Group by Name
) t2
On t1.QTY = t2.MaxQTY And t1.Name = t2.Name
Explanation:
You first calculate the maximum quantity for "Bottle" in the subquery, then join it back to the table to select the row with that quantity and the same name.
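The join-back-to-the-maximum pattern can be sketched the same way. The example below uses Python's sqlite3 module with only the Bottle and Rope rows from the question; the helper name codes_with_max_qty is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tabl (name TEXT, price REAL, qty INTEGER, code INTEGER)")
conn.executemany(
    "INSERT INTO tabl VALUES (?, ?, ?, ?)",
    [
        ("Rope", 3.6, 35, 236),
        ("Bottle", 4.5, 41, 478),
        ("Bottle", 1.8, 12, 123),
        ("Bottle", 1.3, 15, 781),
        ("Rope", 0.9, 14, 965),
    ],
)


def codes_with_max_qty(name):
    """Return the code(s) of the row(s) holding the largest qty for `name`."""
    rows = conn.execute(
        """
        SELECT t1.code
        FROM tabl t1
        JOIN (SELECT name, MAX(qty) AS max_qty
              FROM tabl
              WHERE name = :ccname
              GROUP BY name) t2
          ON t1.qty = t2.max_qty AND t1.name = t2.name
        ORDER BY t1.code
        """,
        {"ccname": name},
    ).fetchall()
    return [code for (code,) in rows]


print(codes_with_max_qty("Bottle"), codes_with_max_qty("Rope"))
```

Unlike the ORDER BY with a one-row limit, this form returns every row that ties for the maximum quantity, which is why the helper returns a list.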

MIN() Function in SQL

Need help with Min Function in SQL
I have a table as shown below.
+------------+-------+-------+
| Date_ | Name | Score |
+------------+-------+-------+
| 2012/07/05 | Jack | 1 |
| 2012/07/05 | Jones | 1 |
| 2012/07/06 | Jill | 2 |
| 2012/07/06 | James | 3 |
| 2012/07/07 | Hugo | 1 |
| 2012/07/07 | Jack | 1 |
| 2012/07/07 | Jim | 2 |
+------------+-------+-------+
I would like to get the output like below
+------------+------+-------+
| Date_ | Name | Score |
+------------+------+-------+
| 2012/07/05 | Jack | 1 |
| 2012/07/06 | Jill | 2 |
| 2012/07/07 | Hugo | 1 |
+------------+------+-------+
When I use the MIN() function with just the date and Score column I get the lowest score for each date, which is what I want. I don't care which row is returned if there is a tie in the score for the same date. Trouble starts when I also want name column in the output. I tried a few variation of SQL (i.e min with correlated sub query) but I have no luck getting the output as shown above. Can anyone help please:)
Query is as follows
SELECT DISTINCT
A.USername, A.Date_, A.Score
FROM TestTable AS A
INNER JOIN (SELECT Date_,MIN(Score) AS MinScore
FROM TestTable
GROUP BY Date_) AS B
ON (A.Score = B.MinScore) AND (A.Date_ = B.Date_);
Use this solution:
SELECT a.date_, MIN(name) AS name, a.score
FROM tbl a
INNER JOIN
(
SELECT date_, MIN(score) AS minscore
FROM tbl
GROUP BY date_
) b ON a.date_ = b.date_ AND a.score = b.minscore
GROUP BY a.date_, a.score
This will get the minimum score per date in the INNER JOIN subselect, which we use to join to the main table. Once we join the subselect, we will only have dates with names having the minimum score (with ties being displayed).
Since we only want one name per date, we then group by date and score, selecting whichever name: MIN(name).
If we want to display the name column, we must use an aggregate function on name to facilitate the GROUP BY on date and score columns, or else it will not work (We could also use MAX() on that column as well).
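To see the two-step grouping on the question's data, here is a small sketch using Python's sqlite3 module. The table name scores is a placeholder; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (date_ TEXT, name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("2012/07/05", "Jack", 1),
        ("2012/07/05", "Jones", 1),
        ("2012/07/06", "Jill", 2),
        ("2012/07/06", "James", 3),
        ("2012/07/07", "Hugo", 1),
        ("2012/07/07", "Jack", 1),
        ("2012/07/07", "Jim", 2),
    ],
)

# Step 1 (subquery): minimum score per date.
# Step 2 (outer query): among the rows that hit that minimum, keep one name per date.
lowest_per_date = conn.execute("""
    SELECT a.date_, MIN(a.name) AS name, a.score
    FROM scores a
    JOIN (SELECT date_, MIN(score) AS minscore
          FROM scores
          GROUP BY date_) b
      ON a.date_ = b.date_ AND a.score = b.minscore
    GROUP BY a.date_, a.score
    ORDER BY a.date_
""").fetchall()

for row in lowest_per_date:
    print(row)
```

Because MIN(name) is applied only after the join has narrowed each date to its lowest-scoring rows, the name returned always belongs to a row that actually has the minimum score.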
Please learn about the GROUP BY functionality of RDBMS.
SELECT Date_,Name,MIN(Score)
FROM T
GROUP BY Name
This assumes that each name and each date appears only once, and it will only work in MySQL.
To make it work on other RDBMSs, you need to apply an aggregate function, such as MAX or MIN, to the Date_ column.
SELECT T.Name, T.Date_, MIN(T.Score) as Score FROM T
GROUP BY T.Date_
Edit: This answer is not correct, as pointed out by JNK in the comments.
SELECT Date_,MAX(Name),MIN(Score)
FROM T
GROUP BY Date_
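Here I am using MAX(Name); it will pick one name if two names are found with the same score.
This will find the minimum score for each day (no duplicates), scored by any player. A name that starts with Z will be picked ahead of a name that starts with A. Note that MAX(Name) and MIN(Score) are computed independently, so the name returned is not necessarily the one that has the minimum score.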
Edit: Fixed by removing group by name

SQL - Select unique rows from a group of results

I have racked my brain on this problem for quite some time. I've also reviewed other questions but was unsuccessful.
The problem I have is, I have a list of results/table that has multiple rows with columns
| REGISTRATION | ID | DATE | UNITTYPE
| 005DTHGP | 172 | 2007-09-11 | MBio
| 005DTHGP | 1966 | 2006-09-12 | Tracker
| 013DTHGP | 2281 | 2006-11-01 | Tracker
| 013DTHGP | 2712 | 2008-05-30 | MBio
| 017DTNGP | 2404 | 2006-10-20 | Tracker
| 017DTNGP | 508 | 2007-11-10 | MBio
I am trying to select rows with unique REGISTRATIONS and where the DATE is max (the latest). The IDs are not proportional to the DATE, meaning the ID could be a low value yet the DATE is higher than the other matching row, and vice versa. Therefore I can't use MAX() on both the DATE and ID, and grouping just doesn't seem to work.
The results I want are as follows;
| REGISTRATION | ID | DATE | UNITTYPE
| 005DTHGP | 172 | 2007-09-11 | MBio
| 013DTHGP | 2712 | 2008-05-30 | MBio
| 017DTNGP | 508 | 2007-11-10 | MBio
PLEASE HELP!!!?!?!?!?!?!?
You want embedded queries (subqueries), which not all SQL dialects support. In T-SQL you'd have something like:
select r.registration, r.recent, t.id, t.unittype
from (
select registration, max([date]) recent
from #tmp
group by
registration
) r
left outer join
#tmp t
on r.recent = t.[date]
and r.registration = t.registration
TSQL:
declare @R table
(
Registration varchar(16),
ID int,
Date datetime,
UnitType varchar(16)
)
insert into @R values ('A',1,'20090824','A')
insert into @R values ('A',2,'20090825','B')
select R.Registration,R.ID,R.UnitType,R.Date from @R R
inner join
(select Registration,Max(Date) as Date from @R group by Registration) M
on R.Registration = M.Registration and R.Date = M.Date
This can be inefficient if you have thousands of rows in your table depending upon how the query is executed (i.e. if it is a rowscan and then a select per row).
In PostgreSQL, and assuming your data is indexed so that a sort isn't needed (or there are so few rows you don't mind a sort):
select distinct on (registration) * from whatever order by registration, "date" desc;
Taking the rows in registration order and descending date order, you get the latest date for each registration first. DISTINCT ON keeps that first row and throws away the rows for the same registration that follow.
select registration,ID,date,unittype
from your_table
where (registration, date) IN (select registration,max(date)
from your_table
group by registration)
This should work in MySQL:
SELECT registration, id, date, unittype
FROM table_name,
(SELECT registration AS temp_reg, MAX(date) AS temp_date
FROM table_name GROUP BY registration) AS temp_table
WHERE registration=temp_reg AND date=temp_date
The idea is to use a subquery in the FROM clause that returns one row per registration containing its latest date (the fields subjected to the grouping), then match on that registration and date in the WHERE clause to fetch the other fields of the same row.
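The "latest date per registration, then join back" idea shared by most of these answers can be tried on the question's rows with a short sketch. It uses Python's sqlite3 module; the table name units and the column name reg_date are placeholders, and the dates are stored as ISO text so that MAX() orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE units (registration TEXT, id INTEGER, reg_date TEXT, unittype TEXT)"
)
conn.executemany(
    "INSERT INTO units VALUES (?, ?, ?, ?)",
    [
        ("005DTHGP", 172, "2007-09-11", "MBio"),
        ("005DTHGP", 1966, "2006-09-12", "Tracker"),
        ("013DTHGP", 2281, "2006-11-01", "Tracker"),
        ("013DTHGP", 2712, "2008-05-30", "MBio"),
        ("017DTNGP", 2404, "2006-10-20", "Tracker"),
        ("017DTNGP", 508, "2007-11-10", "MBio"),
    ],
)

# Find the latest date per registration, then join back to fetch the full row.
latest_rows = conn.execute("""
    SELECT u.registration, u.id, u.reg_date, u.unittype
    FROM units u
    JOIN (SELECT registration, MAX(reg_date) AS latest
          FROM units
          GROUP BY registration) m
      ON m.registration = u.registration AND m.latest = u.reg_date
    ORDER BY u.registration
""").fetchall()

for row in latest_rows:
    print(row)
```

If a registration has two rows with the same latest date, both are returned; a tie-breaker (for example the highest ID) would be needed to force a single row.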