SQL Server - How to delete some rows of columns without disrupting the rest of the record - sql

I have this:
-- -- -- --
01 A1 B1 99
01 A1 B1 98
02 A2 B2 97
02 A2 B2 96
I need this
-- -- -- --
01 A1 B1 99
         98
02 A2 B2 97
         96
------------
I cannot repeat the data that I will present in an Excel file; my result needs to be exactly like this.
In my actual table, the last column holds form responses and the first columns (the ones that must not repeat) are customer data such as phone, name, etc.
The end result of this query will populate a DataTable and be presented in an .xlsx file.
Thanks for sharing knowledge ^^

If you have SQL2012+, you can use LAG() to look at the previous row: NULLIF() returns NULL when the current value equals the previous one, and ISNULL() turns that NULL into an empty string. Keep the window ordering consistent with the final ORDER BY so the blanks land on the right rows:
SELECT
ISNULL(NULLIF(Column1, LAG(Column1) OVER(ORDER BY Column1, Column2, Column3, Column4 DESC)), '')
,ISNULL(NULLIF(Column2, LAG(Column2) OVER(ORDER BY Column1, Column2, Column3, Column4 DESC)), '')
,ISNULL(NULLIF(Column3, LAG(Column3) OVER(ORDER BY Column1, Column2, Column3, Column4 DESC)), '')
,Column4
FROM #mytable
ORDER BY Column1, Column2, Column3, Column4 DESC

It's a little messy, but you can do it in the database. You basically make a subquery that gets the largest value in each group, then join that back to the regular table and blank out the values on rows that don't match. I created your sample set like this:
CREATE TABLE mytable (N1 VARCHAR(2), A VARCHAR(2), B VARCHAR(2), N2 VARCHAR(2))
INSERT INTO mytable VALUES
('01', 'A1', 'B1', '99'),
('01', 'A1', 'B1', '98'),
('02', 'A2', 'B2', '97'),
('02', 'A2', 'B2', '96')
And then was able to get the result like this:
SELECT
CASE WHEN O.N2 = I.N2 THEN O.N1 ELSE '' END,
CASE WHEN O.N2 = I.N2 THEN O.A ELSE '' END,
CASE WHEN O.N2 = I.N2 THEN O.B ELSE '' END,
O.N2
FROM
(SELECT MAX(N2) AS N2, N1, A, B FROM mytable GROUP BY N1, A, B) I
INNER JOIN mytable O
ON O.A = I.A AND O.B = I.B AND O.N1 = I.N1
ORDER BY O.N1 ASC

We can use ROW_NUMBER() to get a sequence number and substitute '' on all rows where the sequence is greater than 1:
with CTE AS
(
    SELECT ID, ColumnA, ColumnB, value,
           ROW_NUMBER() over (PARTITION by id order by id) as seq
    FROM tableA
)
, CTE1 AS
(
    select id, ColumnA, ColumnB, value, seq from CTE where seq = 1
    UNION
    SELECT id, '', '', value, seq from CTE where seq > 1
)
SELECT case when seq > 1 THEN NULL ELSE id END as id, ColumnA, ColumnB, value from CTE1

You can achieve what you want using a query.
You haven't provided DDL, so I am going to assume your columns are called a, b, c and d respectively.
; WITH cte AS (
SELECT a
, b
, c
, d
, Row_Number() OVER (PARTITION BY a, b, c ORDER BY d) As sequence
FROM your_table
)
SELECT CASE WHEN sequence = 1 THEN a ELSE '' END As a
, CASE WHEN sequence = 1 THEN b ELSE '' END As b
, CASE WHEN sequence = 1 THEN c ELSE '' END As c
, d
FROM cte
ORDER
BY a
, b
, c
, d
The idea is to assign an incremental counter to each row that restarts after each change of a + b + c.
We then use a conditional expression to either show a value or not (basically, only show it on the first row of each group).
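For the sample rows from the question (mapped onto the assumed columns a, b, c, d), the cte would look roughly like this:
a   b   c   d   sequence
01  A1  B1  98  1
01  A1  B1  99  2
02  A2  B2  96  1
02  A2  B2  97  2
Note that the ORDER BY d inside ROW_NUMBER() is ascending, so the lowest d in each group keeps its a, b, c values; to get the 99-first layout shown in the question, use ORDER BY d DESC both in the window function and in the final ORDER BY.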

The analytic ROW_NUMBER() function is good for this. I've made up column names because you didn't supply any. To assign a row number by customer, use something like this:
SELECT
Name,
Phone,
Address,
Response,
ROW_NUMBER() OVER (PARTITION BY Name, Phone, Address ORDER BY Response) AS CustRow
FROM myTable
That will assign a row number within each customer. Try it yourself and I think it will make sense.
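For example, with two hypothetical customers (the names, phones, and responses below are invented purely for illustration), the query would produce:
Name   Phone     Address    Response  CustRow
Alice  555-0001  1 Main St  Agree     1
Alice  555-0001  1 Main St  Disagree  2
Bob    555-0002  2 Oak Ave  Agree     1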
You can put it into a subquery or CTE from there and only show customer ID information like name, phone, and address when you're on the first row for each customer:
SELECT
CASE WHEN CustRow = 1 THEN Name ELSE '' END AS Name,
CASE WHEN CustRow = 1 THEN Phone ELSE '' END AS Phone,
CASE WHEN CustRow = 1 THEN Address ELSE '' END AS Address,
Response
FROM (
SELECT
Name,
Phone,
Address,
Response,
ROW_NUMBER() OVER (PARTITION BY Name, Phone, Address ORDER BY Response) AS CustRow
FROM myTable) custSubquery
ORDER BY Name, Phone, Address, Response
The custSubquery on the second-to-last line is because SQL Server requires all subqueries to be aliased, even if the alias isn't used.
The most important thing is to determine how your last column will be ordered for display and to make sure that it's consistent in the ROW_NUMBER() function as well as the final ORDER BY.
If you need more help, please supply table and column names, and specify how results are ordered within each customer.

Related

How to write a BigQuery query that produces the count of the unique transactions and the combination of column names populated

I’m trying to write a query in BigQuery that produces the count of the unique transactions and the combination of column names populated.
I have a table with these columns:
TRAN CODE, Full Name, Given Name, Surname, DOB, Phone
The result set I'm after is:
TRAN CODE  UNIQUE TRANSACTIONS  NAME OF POPULATED COLUMNS
A          3                    Full Name
A          4                    Full Name,Phone
B          5                    Given Name,Surname
B          10                   Given Name,Surname,DOB,Phone
The result set shows that for TRAN CODE A
3 distinct customers provided Full Name
4 distinct customers provided Full Name and Phone #
For TRAN CODE B
5 distinct customers provided Given Name and Surname
10 distinct customers provided Given Name, Surname, DOB, Phone #
Currently to produce my results I’m doing it manually.
I tried using ARRAY_AGG but couldn’t get it working.
Any advice would be appreciated.
Thank you.
I think you want something like this:
select tran_code,
array_to_string(array[case when full_name is not null then 'full_name' end,
case when given_name is not null then 'given_name' end,
case when surname is not null then 'surname' end,
case when dob is not null then 'dob' end,
case when phone is not null then 'phone' end
], ','),
count(*)
from t
group by 1, 2
Consider the approach below - no dependency on any column name other than TRAN_CODE - quite generic!
select TRAN_CODE,
count(distinct POPULATED_VALUES) as UNIQUE_TRANSACTIONS,
POPULATED_COLUMNS
from (
select TRAN_CODE,
( select as struct
string_agg(col, ', ' order by offset) POPULATED_COLUMNS,
string_agg(val order by offset) POPULATED_VALUES,
string_agg(cast(offset as string) order by offset) pos
from unnest(regexp_extract_all(to_json_string(t), r'"([^"]+?)":')) col with offset
join unnest(regexp_extract_all(to_json_string(t), r'"[^"]+?":("[^"]+?"|null)')) val with offset
using(offset)
where val != 'null'
and col != 'TRAN_CODE'
).*
from `project.dataset.table` t
)
group by TRAN_CODE, POPULATED_COLUMNS
order by TRAN_CODE, any_value(pos)
@Gordon_Linoff's solution is the best, but an alternative would be to do the following:
SELECT
TRAN_CODE,
COUNT(TRAN_ROW) AS unique_transactions,
populated_columns
FROM (
SELECT
TRAN_CODE,
TRAN_ROW,
# COUNT(value) AS unique_transactions,
STRING_AGG(field, ",") AS populated_columns
FROM (
SELECT
* EXCEPT(DOB),
CAST(DOB AS STRING ) AS DOB,
ROW_NUMBER() OVER () AS TRAN_ROW
FROM
sample) UNPIVOT(value FOR field IN (Full_name,
Given_name,
Surname,
DOB,
Phone))
GROUP BY
TRAN_CODE,
TRAN_ROW )
GROUP BY
TRAN_CODE,
populated_columns
But this should be more expensive...

How to compare two adjacent rows in SQL?

In SQL, is there a way to compare two adjacent rows? In other words, if C2 = BEM and C3 = Compliance, or if C4 = Compliance and C5 = BEM, then return true. But when consecutive rows are identical, as with C6 = BEM and C7 = BEM, then return fail.
Check out the lead() and lag() functions.
They work best (most reliably) with a sorting field... Your sample does not appear to contain such a field, so I added one in my second solution.
The coalesce() function handles the first row, which does not have a preceding row.
Solution 1 without sort field
create table data
(
A nvarchar(10)
);
insert into data (A) values
('BEM'),
('Compliance'),
('BEM'),
('Compliance'),
('BEM'),
('Compliance'),
('Compliance'),
('Compliance');
select d.A,
coalesce(lag(d.A) over(order by (select null)), '') as lag_A,
case
when d.A <> coalesce(lag(d.A) over(order by (select null)), '')
then 'Ok'
else 'Fail'
end as B
from data d;
Solution 2 with sort field
create table data2
(
Sort int,
A nvarchar(10)
);
insert into data2 (Sort, A) values
(1, 'BEM'),
(2, 'Compliance'),
(3, 'BEM'),
(4, 'Compliance'),
(5, 'BEM'),
(6, 'Compliance'),
(7, 'Compliance'),
(8, 'Compliance');
select d.A,
case
when d.A <> coalesce(lag(d.A) over(order by d.Sort), '')
then 'Ok'
else 'Fail'
end as B
from data2 d
order by d.Sort;
As a starter: a SQL table represents an unordered set of rows; there is no inherent ordering. Assuming that you have a column that defines the ordering of the rows, say id, and that your values are stored in column col, you can use lead() and a case expression as follows:
select col,
case when col = lead(col, 1, col) over(order by id)
then 'Fail' else 'OK'
end as status
from mytable t
Assuming you have some sort of column that you can use to determine the row order then you can use the LEAD window function to get the next value.
SELECT
[A],
CASE
WHEN [A] = LEAD([A], 1, [A]) OVER (ORDER BY SomeSortIndex) THEN 'Fail'
ELSE 'Ok'
END [B]
FROM src
The additional parameters in the LEAD function specify the row offset and the default value to use when there is no further row. Using the current value as the default makes the condition true for the final row, so it displays Fail like the last result in your example.
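As an illustration with the last two rows of the sample data, the final Compliance row has no next row, so LEAD() falls back to the current value and the comparison still reports Fail:
A           LEAD(A)     B
Compliance  Compliance  Fail
Compliance  Compliance  Fail   (no next row, default = current value)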

SQL LEFT() not working as expected when used with GROUP BY and Partition

I have codes like 1231231A, 1231231A, 3453454B, etc.
I need to group them by their number (ignoring the final character, which is a version) and get just one of each. I also need to drop that last character. My code groups them and returns one of each, but it still returns the last character.
Why is it returning the last character when I chop it off?
Expected output is
1231231
3453454
What I'm getting is
1231231A
3453454B
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER(PARTITION BY T.fldProductDescrip
ORDER BY T.fldEffectiveDate DESC) AS rn
FROM (
-- Insert statements for procedure here
SELECT JST.flduid
,JST.fldEffectiveDate
,(CASE
WHEN RIGHT(fldProductDescrip, 1) LIKE '[0-9]'
THEN fldProductDescrip
ELSE LEFT(fldProductDescrip, DATALENGTH(fldProductDescrip) - 1)
END) as fldProductDescrip
,(
CASE
WHEN PE.fldLogoutDateTime IS NULL
THEN PE.fldESigUser
ELSE ''
END
) AS LoggedIn
,(
CASE
WHEN PE.fldLogoutDateTime IS NULL
THEN PE.fldLoginDateTime
ELSE ''
END
) AS LoggedInDateTime
FROM tblJSJobSheetTemplates JST
INNER JOIN tblJSProducts JP ON JST.fldProductUID = JP.fldUID
INNER JOIN tblJSProductEsig PE ON JP.fldProductDescrip = PE.fldProduct
) AS T
WHERE LoggedIn <> ''
)AS G WHERE rn = 1

extract the second found matching substring using Postgresql

I use the below query to extract a value from a column that stores JSON objects.
The issue is that it only pulls the first value matching the regex inside SUBSTRING, which is -$4,000.00. Is there a parameter I can pass to SUBSTRING to pull the value -$1,990.00 as well, into another column?
SELECT attribute_actions_text
, SUBSTRING(attribute_actions_text FROM '"Member [Dd]iscount:":"(.+?)"') AS column_1
, '' AS column_2
FROM (
VALUES
('[{"Member Discount:":"-$4,000.00"},{"Member discount:":"-$1,990.00"}]')
, (NULL)
) ls(attribute_actions_text)
Desired result :
column_1 column_2
-$4,000.00 -$1,990.00
Try this
WITH data(id,attribute_actions_text) as (
VALUES
(1,'[{"Member Discount:":"-$4,000.00"},{"Member Discount:":"-$1,990.00"}]')
, (2,'[{"Member Discount:":"-$4,200.00"},{"Member Discount:":"-$1,890.00"}]')
, (3,NULL)
), match as (
SELECT
id,
m,
ROW_NUMBER()
OVER (PARTITION BY id) AS r
FROM data, regexp_matches(data.attribute_actions_text, '"Member [Dd]iscount:":"(.+?)"', 'g') AS m
)
SELECT
id
,(select m from match where id = d.id AND r=1) as col1
,(select m from match where id = d.id AND r=2) as col2
FROM data d
Result
1,"{-$4,000.00}","{-$1,990.00}"
2,"{-$4,200.00}","{-$1,890.00}"
3,NULL,NULL
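If the curly braces around the values are unwanted (regexp_matches() returns a text[] array), a small variation of the final SELECT, assuming the same CTE names as above, picks the first capture group instead of the whole array:
SELECT
id
,(select m[1] from match where id = d.id AND r=1) as col1
,(select m[1] from match where id = d.id AND r=2) as col2
FROM data d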

sql select to start with a particular record

Is there any way to write a select that starts with a particular record? Suppose I have a table with the following data:
SNO ID ISSUE
----------------------
1 A1 unknown
2 A2 some_issue
3 A1 unknown2
4 B1 some_issue2
5 B3 ISSUE4
6 B1 ISSUE4
Can I write a select to start showing records starting with B1 and then the remaining records? The output should be something like this:
4 B1 some_issue2
6 B1 ISSUE4
1 A1 unknown
2 A2 some_issue
3 A1 unknown2
5 B3 ISSUE4
It doesn't matter if B3 is last, just that B1 should be displayed first.
A couple of different options, depending on what you 'know' ahead of time (i.e. the ID of the record you want to be first, the SNO, etc.):
Union approach:
select 1 as sortOrder, SNO, ID, ISSUE
from tableName
where ID = 'B1'
union all
select 2 as sortOrder, SNO, ID, ISSUE
from tableName
where ID <> 'B1'
order by sortOrder;
Case statement in order by:
select SNO, ID, ISSUE
from tableName
order by case when ID = 'B1' then 1 else 2 end;
You could also consider using temp tables, CTEs, etc., but those approaches would likely be less performant... Try a couple of different approaches in your environment to see which works best.
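For completeness, a minimal sketch of the CTE variant (using the same tableName and columns as above; the CTE name ordered is made up):
with ordered as (
    select SNO, ID, ISSUE,
           case when ID = 'B1' then 1 else 2 end as sortOrder
    from tableName
)
select SNO, ID, ISSUE
from ordered
order by sortOrder;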
Assuming you are using MySQL, you could either use IF() in an ORDER BY clause...
SELECT SNO, ID, ISSUE FROM table ORDER BY IF( ID = 'B1', 0, 1 );
... or you could define a function that imposes your sort order...
DELIMITER $$
CREATE FUNCTION my_sort_order( ID VARCHAR(2), EXPECTED VARCHAR(2) )
RETURNS INT
BEGIN
RETURN IF( ID = EXPECTED, 0, 1 );
END$$
DELIMITER ;
SELECT SNO, ID, ISSUE FROM table ORDER BY my_sort_order( ID, 'B1' );
select * from table1
where id = 'B1'
union all
select * from table1
where id <> 'B1'