How can I write this using SQL? - sql

I need to write SQL that puts "del_row" in the column "Adjustment_name" when an Id_number is duplicated (e.g. 234566), but only when one of the duplicate rows has a Phone_number starting with "A" and another has one starting with "B". In that case, "del_row" should be written only in the row whose Phone_number starts with "B". If an Id_number is duplicated and one Phone_number starts with "A" while the other starts with "C", I don't want to write anything.
Id_number Phone_number Adjustment_name
234566 A5258528564
675467 A1147887422
675534 P1554515315
234566 B4141415882 del_row
234566 C5346656665
Many thanks!

One approach:
SELECT t.id_number, t.Phone_number,
       CASE WHEN a.id_number IS NOT NULL THEN 'del_row' ELSE '' END AS Adjustment_name
FROM mytable t
LEFT JOIN
    (SELECT DISTINCT id_number FROM mytable
     WHERE SUBSTRING(Phone_number FROM 1 FOR 1) = 'A') a
    /* List of IDs that have a phone number starting with A; DISTINCT avoids duplicated output rows */
  ON a.id_number = t.id_number
 AND SUBSTRING(t.Phone_number FROM 1 FOR 1) = 'B'
    /* Only check for matching ID with A if this number starts with B */
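Since the question asks to write the value into the column rather than only select it, the same condition can also be phrased as an UPDATE with EXISTS. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in engine and the table name mytable used above; substr() is SQLite's spelling of the substring function, so adjust it for your database.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (Id_number INTEGER, Phone_number TEXT, Adjustment_name TEXT)"
)
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO mytable (Id_number, Phone_number) VALUES (?, ?)",
    [
        (234566, "A5258528564"),
        (675467, "A1147887422"),
        (675534, "P1554515315"),
        (234566, "B4141415882"),
        (234566, "C5346656665"),
    ],
)

# Flag a row only if its phone starts with 'B' and another row
# with the same Id_number has a phone starting with 'A'.
conn.execute(
    """
    UPDATE mytable
    SET Adjustment_name = 'del_row'
    WHERE substr(Phone_number, 1, 1) = 'B'
      AND EXISTS (SELECT 1
                  FROM mytable a
                  WHERE a.Id_number = mytable.Id_number
                    AND substr(a.Phone_number, 1, 1) = 'A')
    """
)

flagged = conn.execute(
    "SELECT Id_number, Phone_number FROM mytable WHERE Adjustment_name = 'del_row'"
).fetchall()
print(flagged)
```

Only the "B" row of Id_number 234566 is flagged; the "C" row and the non-duplicated IDs keep a NULL Adjustment_name.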

A rather crude approach would be as below
(assuming your phone prefixes sort as Axxx, Bxxx, Cxxx, Dxxx). If your phone numbering logic is different, which is not very clear from your requirements, you can adjust accordingly.
create table temp_table_1 as (
select id_number, phone_number
, case
when dense_rank() over(partition by id_number order by phone_number)>1
and phone_number like 'B%'
then 'del_row'
end adjustment_name
from your_table_name
) with data;
drop table your_table_name;
rename table temp_table_1 to your_table_name;

Related

Calculate field with value based on select statement

I currently have a select statement which returns Customer Numbers that are primary.
For the rows returned, I would like another column, customerRole, whose value is either primary or secondary.
My current select statement returns those that are primary, and I would like a customerRole column that shows these as primary. Then I would like to use this same column with my other select statement to show those that are secondary. When they are run together, I would like to see something like:
accountNumber: 1234455 CustomerRole: Primary
AccountNumber: 3245454 CustomerRole: Secondary
Does anyone know how I can accomplish this? Here is my select to get primary numbers:
SELECT
F.CustomerNumber
FROM ods.CustomerFact F
JOIN ods.holderDim AD
ON F.HolderRowNumber = AD.HolderRowNumber
JOIN ods.holderOwesDim B
ON F.PrimaryHolderNumber = B.SecondaryHolderNumber
I think you want a CASE expression:
SELECT c.CustomerNumber,
       (CASE WHEN EXISTS (SELECT 1
                          FROM ods.holderDim hd
                          WHERE c.HolderRowNumber = hd.HolderRowNumber -- same condition as the question's join
                         ) AND
                  EXISTS (SELECT 1
                          FROM ods.holderOwesDim hod
                          WHERE c.PrimaryHolderNumber = hod.SecondaryHolderNumber
                         )
             THEN 'Primary' ELSE 'Secondary'
        END) AS role
FROM ods.CustomerFact c;
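To see how the CASE/EXISTS idea labels every customer in a single result set, here is a small sketch. It assumes SQLite (through Python's sqlite3) as a stand-in engine, drops the ods. schema prefix, reuses the join conditions from the question's query, and uses hypothetical rows invented only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CustomerFact (CustomerNumber INTEGER,
                               HolderRowNumber INTEGER,
                               PrimaryHolderNumber INTEGER);
    CREATE TABLE holderDim (HolderRowNumber INTEGER);
    CREATE TABLE holderOwesDim (SecondaryHolderNumber INTEGER);

    -- Hypothetical data: only customer 1234455 matches both lookups.
    INSERT INTO CustomerFact VALUES (1234455, 1, 10), (3245454, 2, 20);
    INSERT INTO holderDim VALUES (1), (2);
    INSERT INTO holderOwesDim VALUES (10);
    """
)

# A customer is 'Primary' when both lookups find a match, else 'Secondary'.
roles = conn.execute(
    """
    SELECT c.CustomerNumber,
           CASE WHEN EXISTS (SELECT 1 FROM holderDim hd
                             WHERE c.HolderRowNumber = hd.HolderRowNumber)
                 AND EXISTS (SELECT 1 FROM holderOwesDim hod
                             WHERE c.PrimaryHolderNumber = hod.SecondaryHolderNumber)
                THEN 'Primary' ELSE 'Secondary'
           END AS role
    FROM CustomerFact c
    ORDER BY c.CustomerNumber
    """
).fetchall()
print(roles)
```

Because the role is computed per row, there is no need to run a separate "secondary" query and combine the two result sets.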

SQL Oracle - query to return rows based on data matchng rules

I have the below data
NUMBER SEQUENCE_NUMBER
CA00000045 AAD508
CA00000045 AAD508
CA00000046 AAD509
CA00000047 AAD510
CA00000047 AAD510
CA00000047 AAD511
CA00000048 AAD511
and I would like to find out which rows do not match the following rule:
NUMBER will always be the same when the SEQUENCE_NUMBER is the same.
So in the above data, every row with SEQUENCE_NUMBER 'AAD508' should have the same NUMBER value.
I want to write a query that returns the rows where this rule is broken. For example:
CA00000047 AAD511
CA00000048 AAD511
I don't know where to start with this one, so I have no initial SQL, I'm afraid.
Thanks
You want to self join the data to compare each row to all others sharing the same sequence number, and then filter with a WHERE clause to keep only rows with non-matching numbers. You did not give a name for the table, so I used "table_name" below.
SELECT
a.NUMBER,
a.SEQUENCE_NUMBER
FROM table_name a
INNER JOIN table_name b
ON a.SEQUENCE_NUMBER = b.SEQUENCE_NUMBER
WHERE a.NUMBER <> b.NUMBER
GROUP BY a.NUMBER, a.SEQUENCE_NUMBER
The GROUP BY acts as a DISTINCT; the columns are spelled out because Oracle has traditionally not accepted positional references such as GROUP BY 1,2.
I would simply use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.sequence_number = t.sequence_number and
t2.number <> t.number
);
If each sequence_number had at most two different numbers, you could get each rule-breaker on one row:
select sequence_number, min(number), max(number)
from t
group by sequence_number
having min(number) <> max(number);
Or, you could generalize this to get the list of numbers on a single row:
select sequence_number, listagg(number, ',') within group (order by number) as numbers
from t
group by sequence_number
having min(number) <> max(number);
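The EXISTS and MIN/MAX variants above can be checked against the sample data from the question. This is a minimal sketch, assuming SQLite (through Python's sqlite3) as a stand-in for Oracle and a table named t as in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (number TEXT, sequence_number TEXT)")
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("CA00000045", "AAD508"),
        ("CA00000045", "AAD508"),
        ("CA00000046", "AAD509"),
        ("CA00000047", "AAD510"),
        ("CA00000047", "AAD510"),
        ("CA00000047", "AAD511"),
        ("CA00000048", "AAD511"),
    ],
)

# Rows whose sequence_number also appears with a different number.
violations = conn.execute(
    """
    SELECT DISTINCT t.number, t.sequence_number
    FROM t
    WHERE EXISTS (SELECT 1
                  FROM t t2
                  WHERE t2.sequence_number = t.sequence_number
                    AND t2.number <> t.number)
    ORDER BY t.number
    """
).fetchall()

# One row per offending sequence_number.
summary = conn.execute(
    """
    SELECT sequence_number, MIN(number), MAX(number)
    FROM t
    GROUP BY sequence_number
    HAVING MIN(number) <> MAX(number)
    """
).fetchall()

print(violations)
print(summary)
```

Both queries single out AAD511, the only sequence number in the sample that is shared by two different numbers.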

finding distinct customer_id when the part numbers are not one per row but has delimiter "/"

I have a data set similar to this.
Customer_id PART_N PART_C TXN_ID
B123 268888 7902/7900 159
B123 12839 82900/8900 1278
B869 12839 8203/890025/7902 17890
B290 268888 62820/12839 179018
I'm not sure how to combine PART_N and PART_C and find count(distinct customer_id) for each part. The same part could appear in PART_N or PART_C, like part number 12839.
I am interested in getting the following table using Teradata:
Part COUNT(Distinct Customer id)
268888 2
12839 3
7902 2
7900 1
82900 1
8900 1
8203 1
890025 1
62820 1
If it were just PART_N, it would be straightforward, as only one part number is present per row. I'm unsure how to combine every part number and find how many distinct customer IDs each one has. If it helps, I have the list of all distinct part numbers in one table, say table2.
I cannot try this code, so treat it as pseudocode and a sketch of the idea.
SELECT numbers, COUNT(numbers)
FROM
(SELECT
REGEXP_SPLIT_TO_TABLE( -- B
CONCAT(PART_N, '/', PART_C), -- A
'/'
) as numbers
FROM table) s
GROUP BY numbers -- C
A: Concatenation of both columns into one string divided by the delimiter '/'
B: Split string by delimiter
C: Group string parts and count them
http://www.teradatawiki.net/2014/05/regular-expression-functions.html
This is pretty ugly.
First let's split those delimited strings up, using strtok_split_to_table.
create volatile table vt_split as (
select
txn_id,
token as part
from table
(strtok_split_to_table(your_table.txn_id,your_table.part_c,'/')
returns (txn_id integer,tokennum integer,token varchar(10))) t
)
with data
primary index (txn_id)
on commit preserve rows;
That will give you all those split apart, with the appropriate txn_id.
Then we can union that with the part_n values.
create volatile table vt_merged as (
select * from vt_split
UNION ALL
select
txn_id,
cast(part_n as varchar(10)) as part
from
your_table)
with data
primary index (txn_id)
on commit preserve rows;
Finally, we can join that back to your original table to get the counts of customer by part.
select
vt_merged.part,
count (distinct your_table.customer_id)
from
vt_merged
inner join your_table
on vt_merged.txn_id = your_table.txn_id
group by 1
This could probably be done a little more cleanly, but it should get you what you're looking for.
This is @S-Man's pseudocode as a working query:
WITH cte AS
(
SELECT Customer_id,
Trim(PART_N) ||'/' || PART_C AS all_parts
FROM tab
)
SELECT
part, -- if part should be numeric: Cast(part AS INT)
Count(DISTINCT Customer_id)
FROM TABLE (StrTok_Split_To_Table(cte.Customer_id, cte.all_parts, '/')
RETURNS (Customer_id VARCHAR(10), tokennum INTEGER, part VARCHAR(30))) AS t
GROUP BY 1
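To sanity-check the expected counts outside Teradata, the concatenate, split, and count-distinct steps can be reproduced in a few lines. This is only an illustrative sketch in plain Python using the sample rows from the question, not a replacement for the queries above.

```python
from collections import defaultdict

# Sample rows from the question: (Customer_id, PART_N, PART_C, TXN_ID).
rows = [
    ("B123", "268888", "7902/7900", 159),
    ("B123", "12839", "82900/8900", 1278),
    ("B869", "12839", "8203/890025/7902", 17890),
    ("B290", "268888", "62820/12839", 179018),
]

customers_by_part = defaultdict(set)
for customer_id, part_n, part_c, _txn_id in rows:
    # Same idea as Trim(PART_N) || '/' || PART_C followed by a split on '/'.
    all_parts = part_n.strip() + "/" + part_c
    for part in all_parts.split("/"):
        customers_by_part[part].add(customer_id)

# Distinct customers per part, mirroring Count(DISTINCT Customer_id).
counts = {part: len(customers) for part, customers in customers_by_part.items()}
for part, n in counts.items():
    print(part, n)
```

Using a set per part is what makes the count distinct: B123 appears in two transactions but is counted once for any part it shares across them.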

Find incorrect records by Id

I am trying to find records where the personID is associated to the incorrect SoundFile(String). I am trying to search for incorrect records among all personID's, not just one specific one. Here are my example tables:
TASKS-
PersonID SoundFile(String)
123 D10285.18001231234.mp3
123 D10236.18001231234.mp3
123 D10237.18001231234.mp3
123 D10212.18001231234.mp3
123 D12415.18001231234.mp3
**126 D19542.18001231234.mp3
126 D10235.18001234567.mp3
126 D19955.18001234567.mp3
RECORDINGS-
PhoneNumber(Distinct Records)
18001231234
18001234567
So in this example, I am trying to find all records like the one I marked with **. The majority of the sound files like '%18001231234%' are associated with PersonID 123, but this one record has PersonID 126. For every distinct number in the Recordings table, I need to find all records whose PersonID is not the majority PersonID for that number.
Let me know if you need more information!
Thanks in advance!!
; WITH distinctRecordings AS (
SELECT DISTINCT PhoneNumber
FROM Recordings
),
PersonCounts as (
SELECT t.PersonID, dr.PhoneNumber, COUNT(*) AS num
FROM
Tasks t
JOIN distinctRecordings dr
ON t.SoundFile LIKE '%' + dr.PhoneNumber + '%'
GROUP BY t.PersonID, dr.PhoneNumber
)
SELECT t.PersonID, t.SoundFile
FROM PersonCounts pc1
JOIN PersonCounts pc2
ON pc2.PhoneNumber = pc1.PhoneNumber
AND pc2.PersonID <> pc1.PersonID
AND pc2.Num < pc1.Num
JOIN Tasks t
ON t.PersonID = pc2.PersonID
AND t.SoundFile LIKE '%' + pc2.PhoneNumber + '%'
To summarize what this does... the first CTE, distinctRecordings, is just a distinct list of the Phone Numbers in Recordings.
Next, PersonCounts is a count of phone numbers associated with the records in Tasks for each PersonID.
This is then joined to itself to find any duplicates, and selects whichever duplicate has the smaller count... this is then joined back to Tasks to get the offending soundFile for that person / phone number.
(If your schema had some minor improvements made to it, this query would have been much simpler...)
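The steps described above can be tried on the sample data from the question. This is a minimal sketch, assuming SQLite (through Python's sqlite3) as a stand-in engine, so string concatenation uses || instead of the + operator used in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tasks (PersonID INTEGER, SoundFile TEXT)")
conn.execute("CREATE TABLE Recordings (PhoneNumber TEXT)")
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO Tasks VALUES (?, ?)",
    [
        (123, "D10285.18001231234.mp3"),
        (123, "D10236.18001231234.mp3"),
        (123, "D10237.18001231234.mp3"),
        (123, "D10212.18001231234.mp3"),
        (123, "D12415.18001231234.mp3"),
        (126, "D19542.18001231234.mp3"),
        (126, "D10235.18001234567.mp3"),
        (126, "D19955.18001234567.mp3"),
    ],
)
conn.executemany(
    "INSERT INTO Recordings VALUES (?)",
    [("18001231234",), ("18001234567",)],
)

offenders = conn.execute(
    """
    WITH distinctRecordings AS (
        SELECT DISTINCT PhoneNumber FROM Recordings
    ),
    PersonCounts AS (
        -- Number of sound files per (person, phone number) pair.
        SELECT t.PersonID, dr.PhoneNumber, COUNT(*) AS num
        FROM Tasks t
        JOIN distinctRecordings dr
          ON t.SoundFile LIKE '%' || dr.PhoneNumber || '%'
        GROUP BY t.PersonID, dr.PhoneNumber
    )
    -- pc2 is the person with the smaller count for the same phone number.
    SELECT t.PersonID, t.SoundFile
    FROM PersonCounts pc1
    JOIN PersonCounts pc2
      ON pc2.PhoneNumber = pc1.PhoneNumber
     AND pc2.PersonID <> pc1.PersonID
     AND pc2.num < pc1.num
    JOIN Tasks t
      ON t.PersonID = pc2.PersonID
     AND t.SoundFile LIKE '%' || pc2.PhoneNumber || '%'
    """
).fetchall()
print(offenders)
```

With this data the only offender is the single 18001231234 file attached to PersonID 126. If three or more people share a number, a minority row can match several larger counts, so adding DISTINCT to the final SELECT may be worth considering.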
Here you go: this returns all pairs (PersonID, PhoneNumber) where the person has fewer entries with the given phone number than the person with the most entries. Note that the query doesn't cater for multiple persons tied within a group.
select agg.pid
, agg.PhoneNumber
from (
select MAX(c) KEEP ( DENSE_RANK FIRST ORDER BY c DESC ) OVER ( PARTITION BY rt.PhoneNumber ) cmax
, rt.PhoneNumber
, rt.PersonID pid
, rt.c
from (
select r.PhoneNumber
, t.PersonID
, count(*) c
from recordings r
inner join tasks t on ( r.PhoneNumber = regexp_replace(t.SoundFile, '^[^.]+\.([^.]+)\.[^.]+$', '\1' ) )
group by r.PhoneNumber
, t.PersonID
) rt
) agg
where agg.c < agg.cmax
;
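Caveat: the solution uses Oracle syntax, though most of the operations should be available in the current SQL standard. The KEEP (DENSE_RANK ...) clause is Oracle-specific, but a plain MAX(c) OVER (PARTITION BY rt.PhoneNumber) yields the same value. regexp_replace may also differ between databases, which might not matter too much since your sound file data seems to follow a fixed-position structure.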

How to select a row for certain (or give preference in the selection) in mysql?

Need your help guys in forming a query.
Example.
Company - Car Rental
Table - Cars
ID NAME STATUS
1 Mercedes Showroom
2 Mercedes On-Road
Now, how do I select only one entry from this table which satisfies the below conditions?
If Mercedes is available in Showroom, then fetch only that row. (i.e. row 1 in above example)
But If none of the Mercedes are available in the showroom, then fetch any one of the rows. (i.e. row 1 or row 2) - (This is just to say that all the mercedes are on-road)
Using distinct ain't helping here as the ID's are also fetched in the select statement
Thanks!
Here's a common way of solving that problem:
SELECT *,
CASE STATUS
WHEN 'Showroom' THEN 0
ELSE 1
END AS InShowRoom
FROM Cars
WHERE NAME = 'Mercedes'
ORDER BY InShowRoom
LIMIT 1
Here's how to get all the cars, which also shows another way to solve the problem:
SELECT c1.ID, c1.NAME, IFNULL(c2.STATUS, c1.STATUS) AS STATUS
FROM Cars c1
LEFT OUTER JOIN Cars c2
ON c2.NAME = c1.NAME AND c2.STATUS = 'Showroom'
GROUP BY c1.NAME
ORDER BY c1.NAME
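The "order by a preference flag, then take the first row" idea can be tried on a small data set. This is a minimal sketch, assuming SQLite (through Python's sqlite3) as a stand-in for MySQL; the helper pick_car and the Audi and BMW rows are hypothetical additions used only to show the fallback behaviour.

```python
import sqlite3


def pick_car(conn, name):
    """Return one row for `name`, preferring a Showroom row if any exists."""
    return conn.execute(
        """
        SELECT id, name, status
        FROM cars
        WHERE name = ?
        ORDER BY CASE status WHEN 'Showroom' THEN 0 ELSE 1 END, id
        LIMIT 1
        """,
        (name,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?)",
    [
        (1, "Mercedes", "Showroom"),
        (2, "Mercedes", "On-Road"),
        (3, "Audi", "On-Road"),   # hypothetical row
        (4, "Audi", "Showroom"),  # hypothetical row
        (5, "BMW", "On-Road"),    # hypothetical row
    ],
)

print(pick_car(conn, "Mercedes"))
print(pick_car(conn, "Audi"))
print(pick_car(conn, "BMW"))
```

Sorting by id as a tie-breaker is a design choice made here so the fallback row is deterministic; the question itself accepts any row when no car is in the showroom.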
You would want to use the FIND_IN_SET() function to do that.
SELECT *
FROM Cars
WHERE NAME = 'Mercedes'
ORDER BY FIND_IN_SET(`STATUS`,'Showroom') DESC
LIMIT 1
If you have a preferred order of other statuses, just add them to the second parameter.
ORDER BY FIND_IN_SET(`STATUS`,'On-Road,Showroom' ) DESC
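To fetch the 'best' status for all cars, pick the preferred row per name (a plain GROUP BY NAME would return an arbitrary row from each group):
SELECT c.*
FROM Cars c
WHERE c.ID = (SELECT c2.ID
              FROM Cars c2
              WHERE c2.NAME = c.NAME
              ORDER BY FIND_IN_SET(c2.`STATUS`,'Showroom') DESC
              LIMIT 1)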
SELECT * FROM cars
WHERE name = 'Mercedes'
AND status = 'Showroom'
UNION SELECT * FROM cars
WHERE name = 'Mercedes'
LIMIT 1;
EDIT Removed the ALL on the UNION since we only want distinct rows anyway.
MySQL before 8.0 doesn't have ranking/analytic/windowing functions, but you can use a variable to simulate ROW_NUMBER functionality (when you see "--", it's a comment):
SELECT x.id, x.name, x.status
FROM (SELECT t.id,
             t.name,
             t.status,
             CASE
               WHEN @car_name != t.name THEN @rownum := 1 -- reset on diff name
               ELSE @rownum := @rownum + 1
             END AS rnk,
             @car_name := t.name -- necessary to set @car_name for the comparison
      FROM CARS t
      JOIN (SELECT @rownum := NULL, @car_name := '') r
      ORDER BY t.name, t.status DESC) x -- ORDER BY is necessary for the rnk value
WHERE x.rnk = 1
Ordering by status DESC means that "Showroom" will be at the top of the list, so it'll be ranked as 1. If the car name doesn't have a "Showroom" status, the row ranked as 1 will be whatever status comes after "Showroom". The WHERE clause will only return the first row for each car in the table.
The status being a text based data type tells me your data is not normalized - I could add records with "Showroom", "SHOWroom", and "showROOM". They'd be valid, but you're looking at using functions like LOWER & UPPER when you are grouping things for counting, sum, etc. The use of functions would also render an index on the column useless... You'll want to consider making a CAR_STATUS_TYPE_CODE table, and use a foreign key relationship to make sure bad data doesn't get into your table:
DROP TABLE IF EXISTS `example`.`car_status_type_code`;
CREATE TABLE `example`.`car_status_type_code` (
`car_status_type_code_id` int(10) unsigned NOT NULL auto_increment,
`description` varchar(45) NOT NULL default '',
PRIMARY KEY (`car_status_type_code_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;