Performing Comparisons "down" a table, not across rows - sql

I have a SQL problem that I don't have the vocabulary to explain very well.
Here is a simplified table. My goal is to identify groups where the Tax_IDs are not equal. In this case, the query should return groups 1 and 3.
Group Line_ID Tax_ID
1 1001 11
1 1002 13
2 1003 17
2 1004 17
3 1005 23
3 1006 29
I can easily perform comparisons across rows, however I do not know how to perform comparisons "down" a table (here is really where my vocabulary fails me). I.e. what is the syntax that will cause SQL to compare Tax_ID values within groups?
Any help appreciated,
OB

The simplest way is to use group by with a having clause:
select "group"
from t
group by "group"
having min(tax_id) <> max(tax_id);
You can also phrase the having clause as:
having count(distinct tax_id) > 1;
However, count(distinct) is more expensive than just a min() or max()operation.

Related

How to update a target table with Merge statement

I'm trying to update a column in target table using merge statement by joining target table with source tables and getting the following error.
I need to compare a offer_id, order_Date and Doc_receipt_date from TRADE table with offer_trade_by_date, offer_start_date, offer_end_date. Here I'm trying to verify the trade is done with in the time period. If TRADE is done in time then it's pass the check (i.e 'Y'). If TRADE is not done in time then it didn't pass the check (i.e 'N'). If we don't have any information to check the condition (i.e when DOCK_RECEIPT_DATE is NULL) then ('X'). To achieve this check I wrote below code and getting the following error.
ORA: 30926 unable to get a stable set of rows in the source table.
Check the data in my tables below.
TRADE / Target table
KEYID DPBL_OFFER ORD_DATE DOC_RECPT_DT TRADE_DATE_MET
1 107 30-SEP-17 01-JAN-17 X
2 107 22-SEP-17 NULL X
3 107 07-OCT-17 NULL X
4 107 24-NOV-17 28-NOV-17 X
5 106 24-AUG-17 11-SEP-17 X
6 105 11-JUN-17 NULL X
7 108 05-SEP-17 13-SEP-17 X
8 109 28-JUL-17 10-AUG-17 X
9 110 01-SEP-17 14-SEP-17 X
PROD_OFFER /Source table)
Offer_id Trade_by_Date
106 14-OCT-17
107 14-NOV-17
105 02-AUG-17
108 18-NOV-17
109 14-OCT-17
110 18-NOV-17
OFFER_START_END_V /Source Table 2)
Offer_id Offer_Period Offer_Start_Date Offer_End_Date
106 1 27-JUL-17 27-JUL-17
106 2 28-JUL-17 14-OCT-17
107 1 15-SEP-17 23-JAN-18
105 1 01-JUN-17 02-AUG-17
108 1 23-AUG-17 14-SEP-17
108 2 16-SEP-17 19-SEP-17
110 1 23-AUG-17 14-SEP-17
110 2 16-SEP-17 19-SEP-17
109 1 02-JUL-17 12-NOV-17
Here keyid in my target table is PK and DPBL_OFFER id is offer_id from target table and isn't FK.
Check below code
MERGE INTO TRADE TB
USING (
SELECT T1.KEYID, T1.DPBL_OFFER
, CASE WHEN T1.ORD_DATE >= T3.OFFER_START_DATE AND
T1.ORD_DATE <= T2.TRADE_BY_DATE AND
T1.COD_RECPT_DATE <= T3.OFFER_END_DATE
THEN 'Y'
WHEN T1.ORD_DATE < T3.OFFER_START_DATE AND
T1.ORD_DATE > T2.TRADE_BY_DATE AND
T1.COD_RECPT_DATE > T3.OFFER_END_DATE
THEN 'N'
ELSE 'X'
END AS TRADE_DATE_MET
FROM TRADE T1
JOIN PROD_OFFER T2
ON T1.DPBL_OFFER_ID = T2.OFFER_ID
JOIN OFFER_START_END_V T3
ON T1.DPBL_OFFER_ID = T3.OFFER_ID) JT
ON TB.KEYID = JT.KEYID
AND TB.DPBL_OFFER_ID = JT.OFFER_ID
WHEN MATCH THEN
UPDATE SET TB. TRADE_DATE_MET = JT.TRADE_DATE_MET;
Can some one help me to overcome this error.?
FYI:- I'm using Oracle 11g.
That error generally means that there is at least one row in the target table (at least one, there may be many) for which there are at least TWO different rows in the source table (or result of three-table join, in your case) satisfying the ON condition(s) in the MERGE statement - and for which the values used in the UPDATE clause are actually different.
In your case: for KEYID = 5, DPBL_OFFER is 106. This joins to one row in PROD_OFFER and two different rows in OFFER_START_END_V. And the TRADE_END_MET is different for the two resulting rows in the three-table join. (Or, if it is the same - let's say 'N' for both - for this KEYID, then perhaps for KEYID= 7, with DPBL_OFFER = 108, which also joins to two different rows in the last table, the resulting TRADE_END_MET is not the same in both rows.)
This kind of error is usually fatal, since it is in fact an error in logic, regardless of code (or even of language). That is, even if you express the problem in common language and you try to solve it with paper and pencil, you can't, because it is self contradictory.
It is similar to, but more complicated than: Target table has columns ID (primary key) and FLAG (currently null). Source has columns ID and flag. Target has one row (1, null). Source has two rows, (1, 'Y') and (1, 'N'). You want to use the source to update the flag in the target. Do you see why that doesn't make sense? This is exactly the kind of problem you are having.
Run the three-way join by itself (the "source table" for the MERGE) and inspect the TRADE_END_MET values for KEYID = 5 and 7 - you will likely find the problem.

Select query to fetch required data from SQL table

I have some data like this as shown below:
Acc_Id || Row_No
1 1
2 1
2 2
2 3
3 1
3 2
3 3
3 4
and I need a query to get the results as shown below:
Acc_Id || Row_No
1 1
2 3
3 4
Please consider that I'm a beginner in SQL.
I assume you want the Count of the row
SELECT Acc_Id, COUNT(*)
FROM Table
GROUP BY Acc_Id
Try this:
select Acc_Id, MAX(Row_No)
from table
group by Acc_Id
As a beginner then this is your first exposure to aggregation and grouping. You may want to look at the documentation on group by now that this problem has motivated your interest in a solutions. Grouping operates by looking at rows with common column values, that you specify, and collapsing them into a single row which represents the group. In your case values in Acc_Id are the names for your groups.
The other answers are both correct in the the final two columns are going to be equivalent with your data.
select Acc_Id, count(*), max(Row_No)
from T
group by Acc_Id;
If you have gaps in the numbering then they won't be the same. You'll have to decide whether you're actually looking for a count of rows of a maximum of a value within a column. At this point you can also consider a number of other aggregate functions that will be useful to you in the future. (Note that the actual values here are pretty much meaningless in this context.)
select Acc_Id, min(Row_No), sum(Row_No), avg(Row_No)
from T
group by Acc_Id;

Selecting filtered values from Oracle using ROWNUM

I have a requirement wherein i need to find the record number of the records that are returned from the resultset. I know that i can use ROWNUM to get the record number from the resultset but my issue is slightly different. below are the details
Table : ProcessSummary
Columns:
PS_PK ProcessId StepId AsscoiateId ProcessName AssetAmount
145 25 50 Process1 3,500.00
267 26 45 Process2 4,400.00
356 27 70 Process3 2,400.00
456 28 80 90 Process4 780.00
556 29 56 67 Process5 4,500.00
656 45 70 Process6 6,000.00
789 31 75 Process7 8,000.00
Now what i need to do is fetch all the records from the ProcessSummary Table when either of ProcessId OR StepId OR AssociateId is NULL. I wrote the below query
select * from ProcessSummary where ProcessId IS NULL OR StepId IS NULL OR AsscoiateId IS NULL
As expected i got 1st, 2nd, 3rd, 6th and 7th records in the resultset that got returned.
Now what i need is to get the records numbers 1,2,3,6,7. I tried to use the ROWNUM as below but i got the values of 1,2,3,4,5 and not 1,2,3,6,7.
select ROWNUM from ProcessSummary where ProcessId IS NULL OR StepId IS NULL OR AsscoiateId IS NULL
Is it possible to get the ROWNUM values in the sequence that i want and if yes then can you please let me know how can i do this. Also if ROWNUM cannot be used then what would be the other option that i can use to get the result in the form that i want.
Any help would be greately appericiated as i could not find much on the net or SO regarding this sort of requirement.
Thanks
Vikeng21
rownum is an internal numbering that gives you a row number based on the current query results only, so that numbering is not tied to a specific record, and it will change when you change the data or the query.
But the numbering you ask for is already in your table. It looks like you just need to SELECT PS_PK .. instead. PS_PK is the field in your table that contains the actual number you want.
You can generate a numbering using an analytical function, and then filter that query. You need some fields to order by, though. In this case I've chosen PS_PK, but it can be another field, like ProcessName or a combination of other fields as well.
select
*
from
(select
dense_rank() over (order by PS_PK) as RANKING,
p.*
from
ProcessSummary p)
where
ProcessId IS NULL OR StepId IS NULL OR AsscoiateId IS NULL
So, in this query, first a numbering is calculated for each row that is returned from the inner query. The numbering is returned as the field RANKING. And then the other query filters further, but still will return the field RANKING with the original numbering.
Instead of dense_rank there is also rank and row_number. The differences are subtle, but you can just experiment and read some docs here and here to learn about the differences and see which one fits you best.
Note that this might slow down your query, because the inner query first generates a number for each row in the table (there is no filtering on that level now).

How to group by a column

Hi I know how to use the group by clause for sql. I am not sure how to explain this so Ill draw some charts. Here is my original data:
Name Location
----------------------
user1 1
user1 9
user1 3
user2 1
user2 10
user3 97
Here is the output I need
Name Location
----------------------
user1 1
9
3
user2 1
10
user3 97
Is this even possible?
The normal method for this is to handle it in the presentation layer, not the database layer.
Reasons:
The Name field is a property of that data row
If you leave the Name out, how do you know what Location goes with which name?
You are implicitly relying on the order of the data, which in SQL is a very bad practice (since there is no inherent ordering to the returned data)
Any solution will need to involve a cursor or a loop, which is not what SQL is optimized for - it likes working in SETS not on individual rows
Hope this helps
SELECT A.FINAL_NAME, A.LOCATION
FROM (SELECT DISTINCT DECODE((LAG(YT.NAME, 1) OVER(ORDER BY YT.NAME)),
YT.NAME,
NULL,
YT.NAME) AS FINAL_NAME,
YT.NAME,
YT.LOCATION
FROM YOUR_TABLE_7 YT) A
As Jirka correctly pointed out, I was using the Outer select, distinct and raw Name unnecessarily. My mistake was that as I used DISTINCT , I got the resulted sorted like
1 1
2 user2 1
3 user3 97
4 user1 1
5 3
6 9
7 10
I wanted to avoid output like this.
Hence I added the raw id and outer select
However , removing the DISTINCT solves the problem.
Hence only this much is enough
SELECT DECODE((LAG(YT.NAME, 1) OVER(ORDER BY YT.NAME)),
YT.NAME,
NULL,
YT.NAME) AS FINAL_NAME,
YT.LOCATION
FROM SO_BUFFER_TABLE_7 YT
Thanks Jirka
If you're using straight SQL*Plus to make your report (don't laugh, you can do some pretty cool stuff with it), you can do this with the BREAK command:
SQL> break on name
SQL> WITH q AS (
SELECT 'user1' NAME, 1 LOCATION FROM dual
UNION ALL
SELECT 'user1', 9 FROM dual
UNION ALL
SELECT 'user1', 3 FROM dual
UNION ALL
SELECT 'user2', 1 FROM dual
UNION ALL
SELECT 'user2', 10 FROM dual
UNION ALL
SELECT 'user3', 97 FROM dual
)
SELECT NAME,LOCATION
FROM q
ORDER BY name;
NAME LOCATION
----- ----------
user1 1
9
3
user2 1
10
user3 97
6 rows selected.
SQL>
I cannot but agree with the other commenters that this kind of problem does not look like it should ever be solved using SQL, but let us face it anyway.
SELECT
CASE main.name WHERE preceding_id IS NULL THEN main.name ELSE null END,
main.location
FROM mytable main LEFT JOIN mytable preceding
ON main.name = preceding.name AND MIN(preceding.id) < main.id
GROUP BY main.id, main.name, main.location, preceding.name
ORDER BY main.id
The GROUP BY clause is not responsible for the grouping job, at least not directly. In the first approximation, an outer join to the same table (LEFT JOIN below) can be used to determine on which row a particular value occurs for the first time. This is what we are after. This assumes that there are some unique id values that make it possible to arbitrarily order all the records. (The ORDER BY clause does NOT do this; it orders the output, not the input of the whole computation, but it is still necessary to make sure that the output is presented correctly, because the remaining SQL does not imply any particular order of processing.)
As you can see, there is still a GROUP BY clause in the SQL, but with a perhaps unexpected purpose. Its job is to "undo" a side effect of the LEFT JOIN, which is duplication of all main records that have many "preceding" ( = successfully joined) records.
This is quite normal with GROUP BY. The typical effect of a GROUP BY clause is a reduction of the number of records; and impossibility to query or test columns NOT listed in the GROUP BY clause, except through aggregate functions like COUNT, MIN, MAX, or SUM. This is because these columns really represent "groups of values" due to the GROUP BY, not just specific values.
If you are using SQL*Plus, use the BREAK function. In this case, break on NAME.
If you are using another reporting tool, you may be able to compare the "name" field to the previous record and suppress printing when they are equal.
If you use GROUP BY, output rows are sorted according to the GROUP BY columns as if you had an ORDER BY for the same columns. To avoid the overhead of sorting that GROUP BY produces, add ORDER BY NULL:
SELECT a, COUNT(b) FROM test_table GROUP BY a ORDER BY NULL;
Relying on implicit GROUP BY sorting in MySQL 5.6 is deprecated. To achieve a specific sort order of grouped results, it is preferable to use an explicit ORDER BY clause. GROUP BY sorting is a MySQL extension that may change in a future release; for example, to make it possible for the optimizer to order groupings in whatever manner it deems most efficient and to avoid the sorting overhead.
For full information - http://academy.comingweek.com/sql-groupby-clause/
SQL GROUP BY STATEMENT
SQL GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data into groups.
Syntax:
1. SELECT column_nm, aggregate_function(column_nm) FROM table_nm WHERE column_nm operator value GROUP BY column_nm;
Example :
To understand the GROUP BY clauserefer the sample database.Below table showing fields from “order” table:
1. |EMPORD_ID|employee1ID|customerID|shippers_ID|
Below table showing fields from “shipper” table:
1. | shippers_ID| shippers_Name |
Below table showing fields from “table_emp1” table:
1. | employee1ID| first1_nm | last1_nm |
Example :
To find the number of orders sent by each shipper.
1. SELECT shipper.shippers_Name, COUNT (orders.EMPORD_ID) AS No_of_orders FROM orders LEFT JOIN shipper ON orders.shippers_ID = shipper.shippers_ID GROUP BY shippers_Name;
1. | shippers_Name | No_of_orders |
Example :
To use GROUP BY statement on more than one column.
1. SELECT shipper.shippers_Name, table_emp1.last1_nm, COUNT (orders.EMPORD_ID) AS No_of_orders FROM ((orders INNER JOIN shipper ON orders.shippers_ID=shipper.shippers_ID) INNER JOIN table_emp1 ON orders.employee1ID = table_emp1.employee1ID)
2. GROUP BY shippers_Name,last1_nm;
| shippers_Name | last1_nm |No_of_orders |
for more clarification refer my link
http://academy.comingweek.com/sql-groupby-clause/

SQL Sort Order by The Order Specified In the Query

Say I have a query "select * from clauses where id in (0,2,5,1,3)" and I actually want the rows returned in the same order they are specified the where clause. The order of the IDs will change from query to query and there is no pattern to the order.
I know it's possible to alter the data model, create temp tables, etc. But trust me that those types of solutions will not work in my situation. I also am unable to alter the order of the result objects in my application code.
I'm also aware that different database engines sort things differently, there are no guarantees in some situations blah blah blah. I just want to no, is this possible?
I'll be using mysql or sql server if that helps :)
On MySQL, you can use FIND_IN_SET:
ORDER BY FIND_IN_SET(id, '0,2,5,1,3')
The most portable means of ordering would be to use a CASE expression:
ORDER BY CASE id
WHEN 0 THEN 1
WHEN 2 THEN 2
WHEN 5 THEN 3
WHEN 1 THEN 4
WHEN 3 THEN 5
END
Select ..
From Clauses
Where Id In(0,2,5,1,3)
Order By Case Id
When 0 Then 1
When 2 Then 2
When 5 Then 3
When 1 Then 4
When 3 Then 5
...
End