SQL Show Dates Per Status

I have a table that looks like so:
id animal_id transfer_date status_from status_to
-----------------------------------------------------------------
100 5265 01-Jul-2016 NULL P
101 5265 22-Jul-2016 P A
102 5265 26-Jul-2016 A B
103 5265 06-Aug-2016 B A
I want to create a view to show me the movement of the animal with start and end dates like the following:
animal_id status start_date end_date
---------------------------------------------------------
5265 NULL NULL 30-Jun-2016
5265 P 01-Jul-2016 21-Jul-2016
5265 A 22-Jul-2016 25-Jul-2016
5265 B 26-Jul-2016 05-Aug-2016
5265 A 06-Aug-2016 SYSDATE OR NULL (current status)
As much as I want to provide a query that I've tried, I have none. I don't even know what to search for.

Something like this may be more efficient than a join. Alas, I didn't see a way to avoid scanning the table twice.
NOTE: I didn't use an ORDER BY clause (and indeed, if I had, the ordering would be odd, since I used to_char on the dates to format them, so they would sort as strings). If you need this result in further processing, it is best NOT to wrap the dates in to_char.
with
  input_data ( id, animal_id, transfer_date, status_from, status_to ) as (
    select 100, 5265, to_date('01-Jul-2016', 'dd-Mon-yyyy'), null, 'P' from dual union all
    select 101, 5265, to_date('22-Jul-2016', 'dd-Mon-yyyy'), 'P' , 'A' from dual union all
    select 102, 5265, to_date('26-Jul-2016', 'dd-Mon-yyyy'), 'A' , 'B' from dual union all
    select 103, 5265, to_date('06-Aug-2016', 'dd-Mon-yyyy'), 'B' , 'A' from dual
  )
select animal_id,
       lag(status_to) over (partition by animal_id order by transfer_date) as status,
       to_char(lag(transfer_date) over (partition by animal_id order by transfer_date),
               'dd-Mon-yyyy') as start_date,
       to_char(transfer_date - 1, 'dd-Mon-yyyy') as end_date
  from input_data
union all
select animal_id,
       max(status_to) keep (dense_rank last order by transfer_date),
       to_char(max(transfer_date), 'dd-Mon-yyyy'),
       null
  from input_data
 group by animal_id
;
ANIMAL_ID STATUS START_DATE END_DATE
---------- ------ -------------------- --------------------
5265 30-Jun-2016
5265 P 01-Jul-2016 21-Jul-2016
5265 A 22-Jul-2016 25-Jul-2016
5265 B 26-Jul-2016 05-Aug-2016
5265 A 06-Aug-2016
Added: an explanation of how this works. First, there is a WITH clause to create the input data from the OP's message; this is a standard technique, and anyone not familiar with factored subqueries (CTEs, the WITH clause) - introduced in Oracle 9i - will do themselves (and the rest of us!) a lot of good by reading about them.
The query unions together rows from two sources. In one branch, I use the lag() analytic function: within each group defined by the "partition by" columns, it orders the rows by the "order by" column, and for each row it picks the value from the PREVIOUS row (hence "lag"). So, for example, lag(status_to) looks at all the rows with the same animal_id, orders them by transfer_date, and for each row returns the status_to of the previous row. The rest of that branch of the union works similarly.
There is a second part to the union: as you can see in the original post, there are four input rows, but the output must have five. In general that suggests a union of some sort will be needed somewhere in the solution (either directly and obviously, as in my solution, or via a self-join, or in some other way). Here I just add one more row for the last status (which is still "current"). I use dense_rank last, which, within each group (as defined by the GROUP BY), keeps just the last row by transfer_date.
To understand how the query works, it may help to, first, comment out the lines union all and select... group by animal_id and run what's left. That will show what the first part of the query does. Then un-comment those lines, and instead comment the first part, from the first select animal_id to union all (comment these two lines and everything in between). Run the query again, this will show just the last row for each animal_id.
Of course, in the sample the OP provided there is only one animal_id; if you like, you can add a few more rows (for example in the WITH clause) with different animal_id. Only now the partition by animal_id and the group by animal_id become important; with only one animal_id they wouldn't be needed (for example, if all the rows are already filtered by WHERE animal_id = 5265 somewhere else in a subquery).
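As a quick sanity check of the lag() behavior described above, here is a minimal sketch using SQLite (3.25 or later, as bundled with recent Python) through the standard sqlite3 module; SQLite supports the same LAG window function. ISO date strings and the DATE(x, '-1 day') modifier stand in for Oracle DATEs and transfer_date - 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE input_data (id INT, animal_id INT, transfer_date TEXT,
                         status_from TEXT, status_to TEXT);
INSERT INTO input_data VALUES
  (100, 5265, '2016-07-01', NULL, 'P'),
  (101, 5265, '2016-07-22', 'P', 'A'),
  (102, 5265, '2016-07-26', 'A', 'B'),
  (103, 5265, '2016-08-06', 'B', 'A');
""")

# First branch of the union: each row's status is the PREVIOUS row's
# status_to, its start date is the PREVIOUS transfer_date, and it ends
# the day before the current transfer_date.
rows = conn.execute("""
    SELECT animal_id,
           LAG(status_to) OVER (PARTITION BY animal_id
                                ORDER BY transfer_date) AS status,
           LAG(transfer_date) OVER (PARTITION BY animal_id
                                    ORDER BY transfer_date) AS start_date,
           DATE(transfer_date, '-1 day') AS end_date
      FROM input_data
     ORDER BY transfer_date
""").fetchall()

for r in rows:
    print(r)
# (5265, None, None, '2016-06-30')  -- the interval before any status
```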
ADDED #2 - the OP has requested one more version of this: what if the first row is not needed? Then the query is much easier to write and to read. Below I won't copy the CTE (WITH clause), and I no longer wrap the dates in to_char(). No GROUP BY is needed, and I didn't order the rows (but the OP can do so if needed).
select animal_id,
       status_to as status,
       transfer_date as start_date,
       lead(transfer_date) over (partition by animal_id order by transfer_date) - 1
         as end_date
  from input_data
;
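If an Oracle instance is not at hand, the lead() version can be sanity-checked the same way in SQLite via Python's sqlite3 module (a sketch under the same assumptions as before: ISO date strings instead of Oracle DATEs, and DATE(x, '-1 day') instead of subtracting 1):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE input_data (id INT, animal_id INT, transfer_date TEXT,
                         status_from TEXT, status_to TEXT);
INSERT INTO input_data VALUES
  (100, 5265, '2016-07-01', NULL, 'P'),
  (101, 5265, '2016-07-22', 'P', 'A'),
  (102, 5265, '2016-07-26', 'A', 'B'),
  (103, 5265, '2016-08-06', 'B', 'A');
""")

# Each status starts at its own transfer_date and ends the day before
# the NEXT transfer_date; the last row gets NULL (still current).
rows = conn.execute("""
    SELECT animal_id,
           status_to AS status,
           transfer_date AS start_date,
           DATE(LEAD(transfer_date) OVER (PARTITION BY animal_id
                                          ORDER BY transfer_date),
                '-1 day') AS end_date
      FROM input_data
     ORDER BY transfer_date
""").fetchall()

for r in rows:
    print(r)
```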

Related

insert data in tree in oracle

I want to output table from tbl_ledger_input
My code is :
select parent_code ledger_code,
max(name) name,
4 depth,
max(CONCAT(SUBSTR(LEDGER_CODE,1,5),'0000')) PARENT_CODE,
select sum(balance) balance
from tbl_ledger_input
group by eff_date,
ledger_code,
balance,
ref_cur_id,
eff_date,
ref_branch,
cur_balance
order by eff_date,
ref_cur_id,
eff_date,
ref_branch,
sum(cur_balance) cur_balance,
number_date
from tbl_ledger_branch
where depth =5
group by parent_code,ref_cur_id,eff_date,ref_branch,number_date ;
I got this error :
ORA-00936: missing expression
The code you posted is somewhat messy:
the inner select should be enclosed in parentheses
I presume that the next 3 lines also belong to it
You can use a subquery there, but it must return at most one row - otherwise you'll get a "single-row subquery returns more than one row" (ORA-01427) error
Also, you can't use ORDER BY in there
This is code that might be OK (as far as syntax is concerned):
SELECT parent_code ledger_code,
MAX (name) name,
4 DEPTH,
MAX (CONCAT (SUBSTR (LEDGER_CODE, 1, 5), '0000')) PARENT_CODE,
( SELECT SUM (balance) balance
FROM tbl_ledger_input
GROUP BY eff_date,
ledger_code,
balance,
ref_cur_id,
eff_date,
ref_branch,
cur_balance)
-- order by eff_date , ref_cur_id , eff_date , ref_branch , sum(cur_balance) cur_balance , number_date
FROM tbl_ledger_branch
WHERE DEPTH = 5
GROUP BY parent_code,
ref_cur_id,
eff_date,
ref_branch,
number_date;
but - in my opinion - it is wrong: I doubt that the subquery will actually return only one row, so you'll get an error.
Therefore, use another option. Maybe
you should join tbl_ledger_input and tbl_ledger_branch
or use the queries separately (as subqueries or CTEs) and then merge the results
or correlate the subquery so that it really returns only one row
or something else

Modify my SQL Server query -- returns too many rows sometimes

I need to update the following query so that it only returns one child record (remittance) per parent (claim).
Table Remit_To_Activate contains exactly one date/timestamp per claim, which is what I wanted.
But when I join the full Remittance table to it, since some claims have multiple remittances with the same date/timestamps, the outermost query returns more than 1 row per claim for those claim IDs.
SELECT * FROM REMITTANCE
WHERE BILLED_AMOUNT>0 AND ACTIVE=0
AND REMITTANCE_UUID IN (
SELECT REMITTANCE_UUID FROM Claims_Group2 G2
INNER JOIN Remit_To_Activate t ON (
(t.ClaimID = G2.CLAIM_ID) AND
(t.DATE_OF_LATEST_REGULAR_REMIT = G2.CREATE_DATETIME)
)
where ACTIVE=0 and BILLED_AMOUNT>0
)
I believe the problem would be resolved if I included REMITTANCE_UUID as a column in Remit_To_Activate. That's the REAL issue. This is how I created the Remit_To_Activate table (trying to get the most recent remittance for a claim):
SELECT MAX(create_datetime) as DATE_OF_LATEST_REMIT,
MAX(claim_id) AS ClaimID
INTO Latest_Remit_To_Activate
FROM Claims_Group2
WHERE BILLED_AMOUNT>0
GROUP BY Claim_ID
ORDER BY Claim_ID
Claims_Group2 contains these fields:
REMITTANCE_UUID,
CLAIM_ID,
BILLED_AMOUNT,
CREATE_DATETIME
Here are the two rows that are currently giving me the problem: they're both remittances for the SAME CLAIM, with the SAME TIMESTAMP. I only want one of them in the Remits_To_Activate table, so only ONE remittance will be "activated" per claim. (Screenshot of the two rows omitted.)
You can change your query like this:
SELECT
    p.*, latest_remit.DATE_OF_LATEST_REMIT
FROM
    Remittance AS p INNER JOIN
    (SELECT MAX(create_datetime) AS DATE_OF_LATEST_REMIT,
            claim_id
       FROM Claims_Group2
      WHERE BILLED_AMOUNT > 0
      GROUP BY Claim_ID) AS latest_remit
    ON latest_remit.claim_id = p.claim_id;
This should give you only one row per claim. Untested, so please run it and adjust as needed.
Without having more information on the structure of your database -- especially the structure of Claims_Group2 and REMITTANCE, and the relationship between them, it's not really possible to advise you on how to introduce a remittance UUID into DATE_OF_LATEST_REMIT.
Since you are using SQL Server, however, it is possible to use a window function to introduce a synthetic means to choose among remittances having the same timestamp. For example, it looks like you could approach the problem something like this:
select *
from (
select
r.*,
row_number() over (partition by cg2.claim_id order by cg2.create_datetime desc) as rn
from
remittance r
join claims_group2 cg2
on r.remittance_uuid = cg2.remittance_uuid
where
r.active = 0
and r.billed_amount > 0
and cg2.active = 0
and cg2.billed_amount > 0
) t
where t.rn = 1
Note that this does not depend on your DATE_OF_LATEST_REMIT table at all; it has been subsumed into the inline view. Note also that this will introduce one extra column into your results, though you could avoid that by enumerating the columns of table remittance in the outer select clause.
It also seems odd to be filtering on two sets of active and billed_amount columns, but that appears to follow from what you were doing in your original queries. In that vein, I urge you to check the results carefully, as lifting the filter conditions on cg2 columns up to the level of the join to remittance yields a result that may return rows that the original query did not (but never more than one per claim_id).
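The row_number() tie-breaking idea above can be illustrated end to end with a small SQLite sketch via Python's sqlite3 module (the table and column names here are invented for the demo; SQLite 3.25+ supports the same window function as SQL Server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE remits (claim_id INT, remit_uuid TEXT, create_dt TEXT);
INSERT INTO remits VALUES
  (1, 'u1', '2018-08-15 13:07:50'),
  (1, 'u2', '2018-08-15 13:07:50'),  -- same claim, same timestamp
  (2, 'u3', '2017-12-31 10:00:00');
""")

# row_number() assigns 1, 2, ... within each claim; keeping rn = 1
# picks exactly one remittance per claim even when timestamps tie
# (which of the tied rows wins is arbitrary).
rows = conn.execute("""
    SELECT claim_id, remit_uuid
      FROM (SELECT claim_id, remit_uuid,
                   ROW_NUMBER() OVER (PARTITION BY claim_id
                                      ORDER BY create_dt DESC) AS rn
              FROM remits)
     WHERE rn = 1
     ORDER BY claim_id
""").fetchall()

print(rows)  # exactly one row per claim_id
```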
A co-worker offered me this elegant demonstration of a solution. I'd never used "over" or "partition" before. Works great! Thank you John and Gaurasvsa for your input.
if OBJECT_ID('tempdb..#t') is not null
drop table #t
select *, ROW_NUMBER() over (partition by CLAIM_ID order by CLAIM_ID) as ROW_NUM
into #t
from
(
select '2018-08-15 13:07:50.933' as CREATE_DATE, 1 as CLAIM_ID, NEWID() as REMIT_UUID
union select '2018-08-15 13:07:50.933', 1, NEWID()
union select '2017-12-31 10:00:00.000', 2, NEWID()
) x
select *
from #t
order by CLAIM_ID, ROW_NUM
select CREATE_DATE, MAX(CLAIM_ID), MAX(REMIT_UUID)
from #t
where ROW_NUM = 1
group by CREATE_DATE

How to select 1 date field with 3 tags?

I'm writing a query that selects one date field three times.
SELECT t1.datefield date_1,
t1.datefield date_2,
t1.datefield date_3
FROM table t1
WHERE t1.datefield BETWEEN To_date('01/07/2016', 'dd/mm/yyyy') AND To_date('31/07/2016', 'dd/mm/yyyy')
I need date_1 between July 01 and July 31, date_2 between August 01 and
August 31, and date_3 between September 01 and September 30. How can I do that?
Example:
SELECT t1.date_invoice date_1,
t1.date_invoice date_2,
t1.date_invoice date_3
FROM invoices t1
WHERE t1.date_invoice BETWEEN To_date('01/07/2016', 'dd/mm/yyyy') AND To_date('31/07/2016', 'dd/mm/yyyy')
DATE_1 DATE_2 DATE_3
---------- ----------- -----------
01/07/2016 14/08/2016 15/09/2016
But table only has one date_invoice.
The problem is that you select THE SAME DATE in your query, even though you give it three different names. When you select from a single table, you get its rows, one by one. For each row in the table, you read the date_invoice value and insert that one value into all three columns of your result. Then you force three conditions on this value, and the conditions (obviously) contradict each other, so nothing will match.
What you probably want to do is to look at three rows from your base table simultaneously - and check if the first one has a date in July, the second in August and the third in September. You may still have no results, or exactly one, or perhaps several results (if there were three invoices in July, one in August and two in September, you will get a total of 3*1*2 = 6 results).
Whenever you need to choose a row from one table AND a row from another table, that is a JOIN. In your case, you want three rows, so it's two joins. And your table is the same all three times; when you join a table to itself, that's called a "self-join" and it may sound weird the first time you see it, but it's quite common and no different from joins between different tables.
Now, perhaps you have an order_id somewhere, and you are looking only for triples of invoices for the same order_id. Then you would have a JOIN CONDITION - that the three rows all have the same order_id. Or perhaps you will first filter the base table to only look at the rows that have a given order_id in the first place.
Anyway, the way you asked the question there is no "join condition" - all invoices must be considered and matched against each other. That is called a CROSS JOIN or CARTESIAN JOIN and it is quite useful in some situations; however, often you will see such cross joins that in fact reflect an incorrect solution to a problem (such as, missing a join condition on order_id - you will get an invoice for one order in July matched to an invoice for a different order in August).
What you want with the problem the way you asked it is a repeated cross join, like this:
SELECT t1.datefield date_1,
t2.datefield date_2,
t3.datefield date_3
FROM your_table t1 CROSS JOIN your_table t2 CROSS JOIN your_table t3
WHERE t1.datefield BETWEEN To_date('01/07/2016', 'dd/mm/yyyy')
AND To_date('31/07/2016', 'dd/mm/yyyy')
AND t2.datefield BETWEEN .... -- (dates for August)
AND t3.datefield BETWEEN .... -- (dates for September)
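To see the 3 * 1 * 2 = 6 combinations from the explanation above, here is a small self-contained sketch using SQLite through Python's sqlite3 module (the table and data are invented for the demo; ISO date strings stand in for Oracle dates):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (date_invoice TEXT);
INSERT INTO invoices VALUES
  ('2016-07-01'), ('2016-07-14'), ('2016-07-30'),  -- three in July
  ('2016-08-14'),                                  -- one in August
  ('2016-09-15'), ('2016-09-20');                  -- two in September
""")

# Repeated cross join: every July row paired with every August row
# paired with every September row.
rows = conn.execute("""
    SELECT t1.date_invoice, t2.date_invoice, t3.date_invoice
      FROM invoices t1 CROSS JOIN invoices t2 CROSS JOIN invoices t3
     WHERE t1.date_invoice BETWEEN '2016-07-01' AND '2016-07-31'
       AND t2.date_invoice BETWEEN '2016-08-01' AND '2016-08-31'
       AND t3.date_invoice BETWEEN '2016-09-01' AND '2016-09-30'
""").fetchall()

print(len(rows))  # 3 * 1 * 2 = 6
```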

Need to arrange employee names as per their city column wise

I have written a query which extracts the data from different columns group by city name.
My query is as follows:
select q.first_name
from (select employee_id as eid,first_name,city
from employees
group by city,first_name,employee_id
order by first_name)q
, employees e
where e.employee_id = q.eid;
The output of the query is employee names in a single column grouped by their cities.
Now I would like to enhance the above query to classify the employees by their city names in different columns.
I tried using pivot to make this work. Here is my pivot query:
select * from (
select q.first_name
from (select employee_id as eid,first_name,city
from employees
group by city,first_name,employee_id
order by first_name)q
, employees e
where e.employee_id = q.eid
) pivot
(for city in (select city from employees))
I get a syntax error saying "missing expression", and I am not sure how to use pivot to achieve the below expected output.
Expected Output:
DFW CH NY
---- --- ---
TripeH John Hitman
Batista Cena Yokozuna
Rock James Mysterio
Appreciate if anyone can guide me in the right direction.
Unfortunately what you are trying to do is not possible, at least not in "straight" SQL - you would need dynamic SQL, or a two-step process (in the first step generating a string that is a new SQL statement). Complicated.
The problem is that you are not including a fixed list of city names (as string literals). You are trying to create columns based on whatever you get from (select city from employees). Thus the number of columns and the name of the columns is not known until the Oracle engine reads the data from the table, but before the engine starts it must already know what all the columns will be. Contradiction.
Note also that if this was possible, you almost surely would want (select distinct city from employees).
ADDED: The OP asks a follow-up question in a comment (see below).
The ideal arrangement is for the cities to be in their own, smaller table, and the "city" in the employees table to have a foreign key constraint so that the "city" thing is manageable. You don't want one HR clerk to enter New York, another to enter New York City and a third to enter NYC for the same city. One way or the other, first try your code by replacing the subquery that follows the operator IN in the pivot clause with simply the comma-separated list of string literals for the cities: ... IN ('DFW', 'CH', 'NY'). Note that the order in which you put them in this list will be the order of the columns in the output. I didn't check the entire query to see if there are any other issues; try this and let us know what happens.
Good luck!
Maybe you need to transpose your result; I think DECODE or CASE works best for your case:
select
    (CASE WHEN city = 'DFW' THEN first_name END) DFW,
    (CASE WHEN city = 'CH'  THEN first_name END) CH,
    (CASE WHEN city = 'NY'  THEN first_name END) NY
from employees
order by first_name
Normally I would "edit" my first answer, but the question has changed so much, it's quite different from the original one so my older answer can't be "edited" - this now needs a completely new answer.
You can do what you want with pivoting, as I show below. Wondering why you want to do this in basic SQL and not by using reporting tools, which are written specifically for reporting needs. There's no way you need to keep your data in the pivoted format in the database.
You will see 'York' twice in the Chicago column; you will recognize that's on purpose (you will see I had a duplicate row in the "test" table at the top of my code); this is to demonstrate a possible defect of your arrangement.
Before you ask if you could get the list but without the row numbers - first, if you are simply generating a set of rows, those are not ordered. If you want things ordered for reporting purposes, you can do what I did, and then select "'DFW'", "'CHI'", "'NY'" from the query I wrote. Relational theory and the SQL standard do not guarantee the row order will be preserved, but Oracle apparently does preserve it, at least in current versions; you can use that solution at your own risk.
max(name) in the pivot clause may look odd to the uninitiated; one of the weird limitations of the PIVOT operator in Oracle is that it requires an aggregate function to be used, even if it's over a set of exactly one element.
Here's the code:
with t (city, name) as -- setting up input data for testing
(
select 'DFW', 'Smith' from dual union all
select 'CHI', 'York' from dual union all
select 'DFW', 'Matsumoto' from dual union all
select 'NY', 'Abu Osman' from dual union all
select 'DFW', 'Adams' from dual union all
select 'CHI', 'Wilson' from dual union all
select 'CHI', 'Arenas' from dual union all
select 'NY', 'Theodore' from dual union all
select 'CHI', 'McGhee' from dual union all
select 'NY', 'Zhou' from dual union all
select 'NY' , 'Simpson' from dual union all
select 'CHI', 'Narayanan' from dual union all
select 'CHI', 'York' from dual union all
select 'NY', 'Perez' from dual
)
select * from
(
select row_number() over (partition by city order by name) rn,
city, name
from t
)
pivot (max(name) for city in ('DFW', 'CHI', 'NY') )
order by rn
/
And the output:
RN 'DFW' 'CHI' 'NY'
---------- --------- --------- ---------
1 Adams Arenas Abu Osman
2 Matsumoto McGhee Perez
3 Smith Narayanan Simpson
4 Wilson Theodore
5 York Zhou
6 York
6 rows selected.
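SQLite has no PIVOT operator, but the same row_number-then-pivot idea can be sanity-checked with conditional aggregation (MAX over CASE, which is essentially what PIVOT does under the hood); here is a sketch via Python's sqlite3 module with a shortened version of the test data above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (city TEXT, name TEXT);
INSERT INTO t VALUES
  ('DFW','Smith'), ('CHI','York'), ('DFW','Matsumoto'), ('NY','Abu Osman'),
  ('DFW','Adams'), ('CHI','Wilson'), ('CHI','Arenas'), ('NY','Theodore');
""")

# rn numbers the names alphabetically within each city; grouping by rn
# then lines the cities up side by side, just like the PIVOT query.
rows = conn.execute("""
    SELECT rn,
           MAX(CASE WHEN city = 'DFW' THEN name END) AS dfw,
           MAX(CASE WHEN city = 'CHI' THEN name END) AS chi,
           MAX(CASE WHEN city = 'NY'  THEN name END) AS ny
      FROM (SELECT city, name,
                   ROW_NUMBER() OVER (PARTITION BY city ORDER BY name) AS rn
              FROM t)
     GROUP BY rn
     ORDER BY rn
""").fetchall()

for r in rows:
    print(r)
# (1, 'Adams', 'Arenas', 'Abu Osman')
```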

SQL using Count, with same "Like" multiple times in same cell

I'm trying to get a count on how many times BNxxxx has been commented in the comments cell. So far, I can make each cell be counted once, but there may be multiple comments in a cell containing BNxxxx.
For example, this:
-------
BN0012
-------
BN0012
-------
BN0012
BN0123
-------
should show an output of BN0012 3 times and BN0123 once. Instead, I get BN0012 3 times only.
Here's my code:
select COMMENTS, count(*) as TOTAL
from NOTE
Where COMMENTS like '%BN%' AND CREATE_DATE between '01/1/2015' AND '11/03/2015'
group by COMMENTS
order by Total desc;
Any ideas?
edit
My code now looks like
select BRIDGE_NO, count(*)
from IACD_ASSET b join
IACD_NOTE c
on c.COMMENTS like concat(concat('BN',b.BRIDGE_NO),'%')
Where c.CREATE_DATE between '01/1/2015' AND '11/03/2015' AND length(b.BRIDGE_NO) > 1
group by b.BRIDGE_NO
order by count(*);
The problem with this is that BN44 also matches BN4455. I have tried concat(concat('BN', b.BRIDGE_NO), '_'), but that comes back with nothing. Any ideas how I can get exact matches?
You have a problem. Let me assume that you have a table of all known BN values that you care about. Then you can do something like:
select bn.fullbn, count(*)
from tableBN bn join
comments c
on c.comment like ('%' || bn.fullbn || '%')
group by bn.fullbn;
The performance of this might be quite slow.
If you happen to be storing lists of things in the comment field, then this is a very bad idea. You should not store lists in strings; you should use a junction table.
I'm going to assume that your COMMENTS table has a primary key column (such as comment_id) or at least that comments isn't a CLOB. If it is a CLOB then you're not going to be able to use GROUP BY on that column.
You can accomplish this as follows without even a lookup table of BN.... values. No guarantees as to the performance:
WITH d1 AS (
SELECT 1 AS comment_id, 'BN0123 is a terrible thing BN0121 also BN0000' AS comments
, date'2015-01-03' AS create_date
FROM dual
UNION ALL
SELECT 2 AS comment_id, 'BN0125 is a terrible thing BN0120 also BN1000' AS comments
, date'2015-02-03' AS create_date
FROM dual
)
SELECT comment_id, comments, COUNT(*) AS total FROM (
SELECT comment_id, comments, TRIM(REGEXP_SUBSTR(comments, '(^|\s)BN\d+(\s|$)', 1, LEVEL, 'i')) AS bn
FROM d1
WHERE create_date >= date'2015-01-01'
AND create_date < date'2015-11-04'
CONNECT BY REGEXP_SUBSTR(comments, '(^|\s)BN\d+(\s|$)', 1, LEVEL, 'i') IS NOT NULL
AND PRIOR comment_id = comment_id
AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL
) GROUP BY comment_id, comments;
Note that I corrected your filter:
CREATE_DATE between '01/1/2015' AND '11/03/2015'
First, you should be using ANSI date literals (e.g., date'2015-01-01'); second, using BETWEEN for dates is often a bad idea as Oracle DATE values contain a time portion. So this should be rewritten as:
create_date >= date'2015-01-01'
AND create_date < date'2015-11-04'
Note that the later date is November 4, to make sure we capture all possible comments that were made on November 3.
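The difference between BETWEEN and the half-open range can be demonstrated with a tiny SQLite sketch via Python's sqlite3 module (text timestamps stand in for Oracle DATE values, which carry a time portion):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE note (create_date TEXT);
INSERT INTO note VALUES ('2015-11-03 00:00:00'), ('2015-11-03 09:30:00');
""")

# BETWEEN's upper bound is the bare date '2015-11-03', so rows carrying
# a time portion on Nov 3 sort AFTER it and are silently dropped.
between = conn.execute("""
    SELECT COUNT(*) FROM note
     WHERE create_date BETWEEN '2015-01-01' AND '2015-11-03'
""").fetchone()[0]

# The half-open range < Nov 4 keeps every moment of Nov 3.
half_open = conn.execute("""
    SELECT COUNT(*) FROM note
     WHERE create_date >= '2015-01-01' AND create_date < '2015-11-04'
""").fetchone()[0]

print(between, half_open)  # 0 2
```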
If you want to see the matched comments without aggregating the counts, then do the following (taking out the outer query, basically):
WITH d1 AS (
SELECT 1 AS comment_id, 'BN0123 is a terrible thing BN0121 also BN0000' AS comments
, date'2015-01-03' AS create_date
FROM dual
UNION ALL
SELECT 2 AS comment_id, 'BN0125 is a terrible thing BN0120 also BN1000' AS comments
, date'2015-02-03' AS create_date
FROM dual
)
SELECT comment_id, comments, TRIM(REGEXP_SUBSTR(comments, '(^|\s)BN\d+(\s|$)', 1, LEVEL, 'i')) AS bn
FROM d1
WHERE create_date >= date'2015-01-01'
AND create_date < date'2015-11-04'
CONNECT BY REGEXP_SUBSTR(comments, '(^|\s)BN\d+(\s|$)', 1, LEVEL, 'i') IS NOT NULL
AND PRIOR comment_id = comment_id
AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL;
Given the edits to your question, I think you want something like the following:
SELECT b.bridge_no, COUNT(*) AS comment_cnt
FROM iacd_asset b INNER JOIN iacd_note c
ON REGEXP_LIKE(c.comments, '(^|\W)BN' || b.bridge_no || '(\W|$)', 'i')
WHERE c.create_dt >= date'2015-01-01'
AND c.create_dt < date'2015-03-12' -- It just struck me that your dates are dd/mm/yyyy
AND length(b.bridge_no) > 1
GROUP BY b.bridge_no
ORDER BY comment_cnt;
Note that I am using \W in the regex above instead of \s as I did earlier to make sure that it captures things like BN1234/BN6547.
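The anchoring logic of that regex, including the BN44-inside-BN4455 pitfall, can be checked outside the database; this Python sketch uses lookarounds ((?<!\w) and (?!\w)) to play the role of the (^|\W) and (\W|$) anchors (the comment strings are invented test data):

```python
import re
from collections import Counter

comments = [
    "BN0012 fixed",
    "BN0012 repainted",
    "BN0012 inspected, see also BN0123",  # two tags in one cell
    "BN44/BN4455 both mentioned",         # BN44 must not match inside BN4455
]

counts = Counter()
for c in comments:
    # \W-style anchors: a BN tag may not be preceded or followed by a
    # word character, so shorter tags never match inside longer ones.
    counts.update(re.findall(r'(?<!\w)BN\d+(?!\w)', c))

print(counts.most_common())
# [('BN0012', 3), ('BN0123', 1), ('BN44', 1), ('BN4455', 1)]
```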
Try using the DISTINCT keyword in your select statement to pull in unique values for the comments, like this:
select distinct COMMENTS, count(*) as TOTAL
from NOTE
Where COMMENTS like '%BN%' AND CREATE_DATE between '01/1/2015' AND
'11/03/2015'
group by COMMENTS
order by Total desc;