Partial GROUP BY - SQL

QUERY:
select ws_path from workpaths where
(
(ws_path like '%R_%') or
(ws_path like '%PB_%' ) or
(ws_path like '%ST_%')
)
OUTPUT:
/x/eng/users/ST_3609843_ijti4689_3609843_1601272247
/x/eng/users/ST_3610020_zozt5229_3610020_1601282033
/x/eng/users/ST_3611181_zozt5229_3611181_1601282032
/x/eng/users/ST_3611226_zozt5229_3611226_1601282033
/x/eng/users-random/john/N_3582168_3551186_1601040805
/x/eng/users-random/james/N_3582619_3551186_1601041405
/x/eng/users-random/jimmy/N_3582791_3551186_1601042005
/x/eng/users/R_3606462_3606462_1601251334
/x/eng/users/R_3611775_3612090_1601290909
/x/eng/users/R_3612813_3613016_1601292252
Is there a way to group partially by ST_, N_ and R_?
i.e. GROUP BY ws_path won't work here, for the obvious reason.
I need to look only at the last item in the path (split by '/') and then at the part before the first '_'.

You can use regexp_substr to extract the pattern being searched for, then group by it and count the occurrences:
select regexp_substr(ws_path,'\/R_|\/PB_|\/ST_'), count(*)
from workpaths
group by regexp_substr(ws_path,'\/R_|\/PB_|\/ST_')
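If you literally want the prefix of the last path item (so N_ is picked up as well), nesting regexp_substr is one option (a sketch, assuming an Oracle-style regexp_substr):
select regexp_substr(regexp_substr(ws_path, '[^/]+$'), '^[^_]+') as prefix, count(*)
from workpaths
group by regexp_substr(regexp_substr(ws_path, '[^/]+$'), '^[^_]+')
The inner call takes the last '/'-separated item, the outer call takes the part before the first '_'.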

Regexp is a good solution but can be expensive. Plain LIKE tests can be cheaper and faster:
CREATE TABLE tbl (field1 VARCHAR(100));
INSERT INTO dbo.tbl
( field1 )
VALUES
('/x/eng/users/ST_3609843_ijti4689_3609843_1601272247'),
('/x/eng/users/ST_3610020_zozt5229_3610020_1601282033'),
('/x/eng/users/ST_3611181_zozt5229_3611181_1601282032'),
('/x/eng/users/ST_3611226_zozt5229_3611226_1601282033'),
('/x/eng/users-random/john/N_3582168_3551186_1601040805'),
('/x/eng/users-random/james/N_3582619_3551186_1601041405'),
('/x/eng/users-random/jimmy/N_3582791_3551186_1601042005'),
('/x/eng/users/R_3606462_3606462_1601251334'),
('/x/eng/users/R_3611775_3612090_1601290909'),
('/x/eng/users/R_3612813_3613016_1601292252');
SELECT
COUNT(CASE WHEN field1 LIKE '%/ST_%' THEN 1 ELSE NULL END) AS 'st_count',
COUNT(CASE WHEN field1 LIKE '%/N_%' THEN 1 ELSE NULL END) AS 'n_count',
COUNT(CASE WHEN field1 LIKE '%/R_%' THEN 1 ELSE NULL END) AS 'r_count'
FROM dbo.tbl
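One caveat for both the original query and the LIKE tests above: '_' is a single-character wildcard in LIKE, so '%/ST_%' also matches '/STX...'. If that could cause false positives, escape the underscore; a sketch of the same counts with an ESCAPE clause (standard SQL / SQL Server):
SELECT
COUNT(CASE WHEN field1 LIKE '%/ST\_%' ESCAPE '\' THEN 1 END) AS 'st_count',
COUNT(CASE WHEN field1 LIKE '%/N\_%' ESCAPE '\' THEN 1 END) AS 'n_count',
COUNT(CASE WHEN field1 LIKE '%/R\_%' ESCAPE '\' THEN 1 END) AS 'r_count'
FROM dbo.tbl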

Related

Coalesce in duplicated values

I have a table like this:
And I want to transform it so that each value becomes a column, like this:
If I do a query like this:
Select "_sdc_source_key_id",
COALESCE(value='Integrity',null) as cia_security
,COALESCE (value='Confidentiality',null) as cia_conf
,COALESCE (value='Availability',null) as cia_availability
FROM
staging_jira.issues__fields__customfield_10420
where _sdc_source_key_id='201496'
That is my result, I have duplicated rows:
What should be the best solution to achieve my transformation?
Thanks a lot!
You can GROUP BY "_sdc_source_key_id" and use MAX of your values:
Select "_sdc_source_key_id",
MAX(COALESCE(value='Integrity',null)) as cia_security
,MAX(COALESCE (value='Confidentiality',null)) as cia_conf
,MAX(COALESCE (value='Availability',null)) as cia_availability
FROM
staging_jira.issues__fields__customfield_10420
where _sdc_source_key_id='201496'
GROUP BY "_sdc_source_key_id"
If your database doesn't support MAX on booleans, switch to int:
Select "_sdc_source_key_id",
MAX(CASE WHEN value='Integrity' THEN 1 ELSE null END) as cia_security
,MAX(CASE WHEN value='Confidentiality' THEN 1 ELSE null END) as cia_conf
,MAX(CASE WHEN value='Availability' THEN 1 ELSE null END) as cia_availability
FROM
staging_jira.issues__fields__customfield_10420
where _sdc_source_key_id='201496'
GROUP BY "_sdc_source_key_id"

SQL Server 2014 - SQL Case statement on columns

This is the table I have :
For every unique TID there are 2 records. For a unique TID, if a field is populated in both records, I want the name of that field. For example, for T01: Field2 and Field4 are populated in both records.
My current approach is to create a column with comma-separated field names:
INSERT INTO TEMP
SELECT *,
(CASE WHEN COUNT(IIF(Field1 IS NOT NULL,1,NULL)) OVER (PARTITION BY TID) = 2 THEN 'FIELD1' ELSE 'NO' END) + ',' +
(CASE WHEN COUNT(IIF(Field2 IS NOT NULL,1,NULL)) OVER (PARTITION BY TID) = 2 THEN 'FIELD2' ELSE 'NO' END) + ',' +
(CASE WHEN COUNT(IIF(Field3 IS NOT NULL,1,NULL)) OVER (PARTITION BY TID) = 2 THEN 'FIELD3' ELSE 'NO' END) + ',' +
(CASE WHEN COUNT(IIF(Field4 IS NOT NULL,1,NULL)) OVER (PARTITION BY TID) = 2 THEN 'FIELD4' ELSE 'NO' END) AS ATTR
FROM ORIGINAL_TABLE;
I then convert the comma-separated column into multiple records:
SELECT *, S.ITEMS as ATTRIBUTES
FROM TEMP
CROSS APPLY DBO.SPLIT(ATTR, ',') S
WHERE S.ITEMS NOT LIKE '%NO%'
Consider T101 in the result obtained from the above command. This gives me the output:
Edit: Apologies, it should be Field2 instead of Field1.
This does give me information on the fields that satisfy the condition for every unique TID, but I want it to be more specific. I run this on very big data with over 100 columns, so this approach is slow.
Is there a way to display just the fields that satisfy the condition, and their values for both records of T101?
Edit: Apologies, it should be Field2 instead of Field1 in the table.
I am fairly new to SQL, any help would be much appreciated!
Your question is rather complicated, and I'm not 100% sure what you really want. But based on:
For a unique TID if both records in a field is populated I want the name of the field.
You can unpivot and aggregate. Assuming that your columns all have a similar data type, you can use:
SELECT t.tId, v.fieldname
FROM ORIGINAL_TABLE t CROSS APPLY
(VALUES ('Field1', Field1),
('Field2', Field2),
('Field3', Field3),
('Field4', Field4)
) v(fieldname, val)
GROUP BY t.tID, v.fieldname
HAVING COUNT(*) = COUNT(v.val) -- all populated
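If you still want the single comma-separated column per TID that your TEMP approach produced, the same unpivot can feed a string concatenation. A sketch for SQL Server 2014 (STRING_AGG only arrived in 2017, hence STUFF + FOR XML PATH), reusing the ORIGINAL_TABLE/tId names from above:
SELECT x.tId,
STUFF((SELECT ',' + v.fieldname
FROM ORIGINAL_TABLE t
CROSS APPLY (VALUES ('Field1', t.Field1),
('Field2', t.Field2),
('Field3', t.Field3),
('Field4', t.Field4)) v(fieldname, val)
WHERE t.tId = x.tId
GROUP BY v.fieldname
HAVING COUNT(*) = COUNT(v.val)
FOR XML PATH('')), 1, 1, '') AS populated_fields
FROM (SELECT DISTINCT tId FROM ORIGINAL_TABLE) x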

Is there a way to return all unique values within a given row

I am working in Oracle SQL and I have a data set with these headers
In this data set the card number is unique; however, the customer can have duplicates in their Suggestion fields. Rather than going through and writing case statements, is there a way to keep only the unique values within a given row?
Please note, some customers will be left with more "unique" suggestions than others.
For example:
My goal would be for my final output to look like this
As I mentioned previously, I would just write case statements such as:
SELECT DISTINCT CARD_NUMBER
,SUGGESTION_1
,CASE
WHEN SUGGESTION_2 != SUGGESTION_1
THEN SUGGESTION_2
WHEN SUGGESTION_3 != SUGGESTION_1
THEN SUGGESTION_3
WHEN SUGGESTION_4 != SUGGESTION_1
THEN SUGGESTION_4
WHEN SUGGESTION_5 != SUGGESTION_1
THEN SUGGESTION_5
END AS SUGGESTION_2,
CASE
WHEN SUGGESTION_2 != SUGGESTION_1
AND SUGGESTION_3 != SUGGESTION_1
AND SUGGESTION_3 != SUGGESTION_2
THEN SUGGESTION_3
END AS SUGGESTION_3
I would do this until all unique values are left, and there just has to be an easier way
Any help would be EXTREMELY appreciated, thank you!
You can use union all and conditional aggregation. Here is the idea that puts the results in a single column:
select card, listagg(suggestion, ', ') within group (order by which) as suggestions
from (select card, suggestion, min(which) as which
from ((select card, 1 as which, suggestion_1 as suggestion from t) union all
(select card, 2, suggestion_2 from t) union all
(select card, 3, suggestion_3 from t) union all
(select card, 4, suggestion_4 from t) union all
(select card, 5, suggestion_5 from t)
) t
group by card, suggestion
) t
group by card;
You can do something similar with conditional aggregation if you want the values in separate columns.
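For example, if a card's five suggestion columns held G4, G6, G6, G11, G6, this would return the single value 'G4, G6, G11' for that card (duplicates collapse to their first position).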
I would try to pivot the table to long, and then back to wide
Setup:
create table testtbl
(
CARD_NUMBER int
,SUGGESTION_1 varchar2(100)
,SUGGESTION_2 varchar2(100)
,SUGGESTION_3 varchar2(100)
,SUGGESTION_4 varchar2(100)
,SUGGESTION_5 varchar2(100)
);
insert into testtbl values (1234,'G11','G4','G3','G2','G6');
insert into testtbl values (5678,'G4','G6','G6','G11','G6');
insert into testtbl values (9101,'G1','G3','G11','G4','G11');
Then the query itself, first the pivot to long format. Here I use a function just to return the numbers 1 to 5 - this is instead of joining the table to itself 5 times; this way it should only pass through the test table once.
I then use the analytic function row_number to order the unique values according to their first placement.
The second select uses MAX to pivot back to wide:
with cte AS
(
SELECT
CARD_NUMBER
,MIN(n.column_value ) n
,CASE n.column_value
WHEN 1 THEN f.SUGGESTION_1
WHEN 2 THEN f.SUGGESTION_2
WHEN 3 THEN f.SUGGESTION_3
WHEN 4 THEN f.SUGGESTION_4
WHEN 5 THEN f.SUGGESTION_5
END Suggestion
,ROW_NUMBER() OVER (PARTITION BY f.CARD_NUMBER ORDER BY MIN(n.column_value)) rn
FROM testtbl f
CROSS JOIN table(sys.odcinumberlist(1,2,3,4,5)) n
GROUP BY f.CARD_NUMBER,CASE n.column_value
WHEN 1 THEN f.SUGGESTION_1
WHEN 2 THEN f.SUGGESTION_2
WHEN 3 THEN f.SUGGESTION_3
WHEN 4 THEN f.SUGGESTION_4
WHEN 5 THEN f.SUGGESTION_5
END
)
SELECT
CARD_NUMBER
,MAX(CASE WHEN rn=1 THEN Suggestion ELSE '' end)SUGGESTION_1
,MAX(CASE WHEN rn=2 THEN Suggestion ELSE '' end)SUGGESTION_2
,MAX(CASE WHEN rn=3 THEN Suggestion ELSE '' end)SUGGESTION_3
,MAX(CASE WHEN rn=4 THEN Suggestion ELSE '' end)SUGGESTION_4
,MAX(CASE WHEN rn=5 THEN Suggestion ELSE '' end)SUGGESTION_5
FROM cte
GROUP BY CARD_NUMBER
ORDER BY CARD_NUMBER
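Hand-checking the three sample rows against this query, the expected output is (blank where fewer than five unique suggestions remain):
CARD_NUMBER SUGGESTION_1 SUGGESTION_2 SUGGESTION_3 SUGGESTION_4 SUGGESTION_5
1234        G11          G4           G3           G2           G6
5678        G4           G6           G11
9101        G1           G3           G11          G4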

SQL using CASE in SELECT with GROUP BY. Need CASE-value but get row-value

So basically there is 1 question and 1 problem:
1. Question - when I have around 100 columns in a table (and no key or unique index is set) and I want to join or subselect that table with itself, do I really have to write out every column name?
2. Problem - the example below shows the first question and my actual SQL-statement problem.
Example:
SELECT
A.FIELD1,
(SELECT CASE WHEN B.FIELD2 = 1 THEN B.FIELD3 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD1,
(SELECT CASE WHEN B.FIELD2 = 2 THEN B.FIELD4 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD2
FROM TABLE A
GROUP BY A.FIELD1
The story is: if I don't put the CASE into its own select statement, then I have to put the actual column name into the GROUP BY, and the GROUP BY doesn't group on the NULL value from the CASE but on the actual value from the row. Because of that I would have to either join or subselect with all columns, since there is no key and no unique index, or somehow find another solution.
DBServer is DB2.
So now to describing it just with words and no SQL:
I have "order items" which can be divided into "ZD" and "EK" (1 = ZD, 2 = EK) and can be grouped by "distributor". Even though "order items" can have one of two different "departements"(ZD, EK), the fields/rows for "ZD" and "EK" are always both filled. I need the grouping to consider the "departement" and only if the designated "departement" (ZD or EK) is changing, then I want a new group to be created.
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
ZD,
EK,
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
This worked with ZD and EK in the SELECT and ZD, EK in the GROUP BY. The only problem was, even if EK was not the designated DEPARTEMENT, it still opened a new group if it changed, because it was using the real EK value and not the NULL from the CASE, as I explained up top.
And here, ladies and gentlemen, is the solution to the problem:
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END),
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END),
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
#t-clausen.dk: Thank you!
#others: ...
Actually there is a wildcard equality test.
I am not sure why you would group by field1, that would seem impossible in your example. I tried to fit it into your question:
SELECT FIELD1,
CASE WHEN FIELD2 = 1 THEN FIELD3 END AS CASEFIELD1,
CASE WHEN FIELD2 = 2 THEN FIELD4 END AS CASEFIELD2
FROM
(
SELECT * FROM A
INTERSECT
SELECT * FROM B
) C
UNION -- results in a distinct
SELECT
C.FIELD1,
null,
null
FROM
(
SELECT * FROM A
EXCEPT
SELECT * FROM B
) C
This will fail for datatypes that are not comparable
No, there's no wildcard equality test. You'd have to list every field you want tested individually. If you don't want to test each individual field, you could use a hack such as concatenating all the fields, e.g.
WHERE (a.foo + a.bar + a.baz) = (b.foo + b.bar + b.baz)
but either way, you're listing all of the fields.
I might tend to solve it something like this
WITH q as
(SELECT
DEPARTEMENT
, (CASE WHEN DEPARTEMENT = 1 THEN ZD
WHEN DEPARTEMENT = 2 THEN EK
ELSE null
END) AS GRP
, DISTRIBUTOR
, SOMETHING
FROM mytable
)
SELECT
DEPARTEMENT
, Grp
, Distributor
, sum(SOMETHING) AS SumTHING
FROM q
GROUP BY
DEPARTEMENT
, GRP
, DISTRIBUTOR
If you need to find all rows in TableA that match in TableB, how about INTERSECT or INTERSECT DISTINCT?
select * from A
INTERSECT DISTINCT
select * from B
However, if you only want rows from A where the entire row matches the values in a row from B, then why does your sample code take some values from A and others from B? If the row matches on all columns, then that would seem pointless. (Perhaps your question could be explained a bit more fully?)

Counting null and non-null values in a single query

I have a table
create table us
(
a number
);
Now I have data like:
a
1
2
3
4
null
null
null
8
9
Now I need a single query to count null and not null values in column a
This works for Oracle and SQL Server (you might be able to get it to work on another RDBMS):
select sum(case when a is null then 1 else 0 end) count_nulls
, count(a) count_not_nulls
from us;
Or:
select count(*) - count(a), count(a) from us;
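With the sample data above (three NULLs, six non-NULL values), either form returns 3 and 6.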
If I understood correctly you want to count all NULL and all NOT NULL in a column...
If that is correct:
SELECT count(*) FROM us WHERE a IS NULL
UNION ALL
SELECT count(*) FROM us WHERE a IS NOT NULL
Edited to have the full query, after reading the comments :]
SELECT COUNT(*), 'null_tally' AS narrative
FROM us
WHERE a IS NULL
UNION
SELECT COUNT(*), 'not_null_tally' AS narrative
FROM us
WHERE a IS NOT NULL;
Here is a quick and dirty version that works on Oracle:
select sum(case when a is null then 1 else 0 end) "Null values",
sum(case when a is null then 0 else 1 end) "Non-null values"
from us
For non-nulls:
select count(a)
from us
For nulls:
select count(*) - count(a)
from us
Hence
SELECT COUNT(A) NOT_NULLS
FROM US
UNION ALL
SELECT COUNT(*) - COUNT(A) NULLS
FROM US
ought to do the job
Better, in that the column titles come out correct:
SELECT COUNT(A) NOT_NULL, COUNT(*) - COUNT(A) NULLS
FROM US
In some testing on my system, it costs a full table scan.
As I understood your query, you can just run this script and get the total null and total not-null rows:
select count(*) - count(a) as 'Null', count(a) as 'Not Null' from us;
Usually I use this trick:
select sum(case when a is null then 0 else 1 end) as count_notnull,
sum(case when a is null then 1 else 0 end) as count_null
from tab
Just to provide yet another alternative, Postgres 9.4+ allows applying a FILTER to aggregates:
SELECT
COUNT(*) FILTER (WHERE a IS NULL) count_nulls,
COUNT(*) FILTER (WHERE a IS NOT NULL) count_not_nulls
FROM us;
SQLFiddle: http://sqlfiddle.com/#!17/80a24/5
This is a little tricky. Assume the table has just one nullable column; then COUNT(1) and COUNT(column) will give different values.
set nocount on
declare @table1 table (empid int)
insert @table1 values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(NULL),(11),(12),(NULL),(13),(14);
select * from @table1
select COUNT(1) as "COUNT(1)" from @table1
select COUNT(empid) "Count(empid)" from @table1
Query Results
As you can see in the image, the first result shows the table has 16 rows, two of which are NULL. When we use COUNT(1) (or COUNT(*)), the query engine counts the number of rows, so we get 16. But COUNT(empid) counts only the non-NULL values in the column empid, so we get 14.
So whenever we use COUNT(column), make sure we take care of NULL values, as shown below.
select COUNT(isnull(empid,1)) from @table1
will count both NULL and non-NULL values.
Note: the same thing applies even when the table is made up of more than one column. COUNT(1) will give the total number of rows irrespective of NULL/non-NULL values. Only when column values are counted using COUNT(column) do we need to take care of NULL values.
I had a similar issue: counting all distinct values, with null counted as a value, too. A simple count doesn't work in this case, as it does not take null values into account.
Here's a snippet that works on SQL Server and does not involve selection of new values.
Basically, once the distinct is performed, also return the row number in a new column (n) using the row_number() function, then perform a count on that column:
SELECT COUNT(n)
FROM (
SELECT *, row_number() OVER (ORDER BY [MyColumn] ASC) n
FROM (
SELECT DISTINCT [MyColumn]
FROM [MyTable]
) items
) distinctItems
Try this..
SELECT CASE
WHEN a IS NULL THEN 'Null'
ELSE 'Not Null'
END a,
Count(1)
FROM us
GROUP BY CASE
WHEN a IS NULL THEN 'Null'
ELSE 'Not Null'
END
Here are two solutions:
Select count(columnname) as countofNotNulls, count(isnull(columnname,1))-count(columnname) AS Countofnulls from tablename
OR
Select count(columnname) as countofNotNulls, count(*)-count(columnname) AS Countofnulls from tablename
Try (MySQL syntax, where ISNULL(a) returns 1 or 0):
SELECT
SUM(ISNULL(a)) AS all_null,
SUM(!ISNULL(a)) AS all_not_null
FROM us;
Simple!
If you're using MS Sql Server...
SELECT COUNT(0) AS 'Null_ColumnA_Records',
(
SELECT COUNT(0)
FROM your_table
WHERE ColumnA IS NOT NULL
) AS 'NOT_Null_ColumnA_Records'
FROM your_table
WHERE ColumnA IS NULL;
I don't recommend doing this... but here you have it (in the same table as the result).
Or use the ISNULL built-in function.
All the answers are either wrong or extremely out of date.
The simple and correct way of doing this query is using the COUNT_IF function (available in engines such as Snowflake, Presto/Trino and DuckDB).
SELECT
COUNT_IF(a IS NULL) AS nulls,
COUNT_IF(a IS NOT NULL) AS not_nulls
FROM
us
SELECT SUM(NULLs) AS 'NULLS', SUM(NOTNULLs) AS 'NOTNULLs' FROM
(select count(*) AS 'NULLs', 0 as 'NOTNULLs' FROM us WHERE a is null
UNION select 0 as 'NULLs', count(*) AS 'NOTNULLs' FROM us WHERE a is not null) AS x
It's fugly, but it will return a single record with 2 cols indicating the count of nulls vs non nulls.
This works in T-SQL. If you're just counting the number of something and you want to include the nulls, use COALESCE instead of case.
IF OBJECT_ID('tempdb..#us') IS NOT NULL
DROP TABLE #us
CREATE TABLE #us
(
a INT NULL
);
INSERT INTO #us VALUES (1),(2),(3),(4),(NULL),(NULL),(NULL),(8),(9)
SELECT * FROM #us
SELECT CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END AS 'NULL?',
COUNT(CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END) AS 'Count'
FROM #us
GROUP BY CASE WHEN a IS NULL THEN 'NULL' ELSE 'NON-NULL' END
SELECT COALESCE(CAST(a AS NVARCHAR),'NULL') AS a,
COUNT(COALESCE(CAST(a AS NVARCHAR),'NULL')) AS 'Count'
FROM #us
GROUP BY COALESCE(CAST(a AS NVARCHAR),'NULL')
Building off of Alberto, I added the rollup.
SELECT [Narrative] = CASE
WHEN [Narrative] IS NULL THEN 'count_total' ELSE [Narrative] END
,[Count]=SUM([Count]) FROM (SELECT COUNT(*) [Count], 'count_nulls' AS [Narrative]
FROM [CrmDW].[CRM].[User]
WHERE [EmployeeID] IS NULL
UNION
SELECT COUNT(*), 'count_not_nulls ' AS narrative
FROM [CrmDW].[CRM].[User]
WHERE [EmployeeID] IS NOT NULL) S
GROUP BY [Narrative] WITH CUBE;
SELECT
ALL_VALUES
,COUNT(ALL_VALUES)
FROM(
SELECT
NVL2(A,'NOT NULL','NULL') AS ALL_VALUES
,NVL(A,0)
FROM US
)
GROUP BY ALL_VALUES
select count(isnull(NullableColumn,-1))
If it's MySQL, you can try something like this:
select
(select count(*) from TABLENAME where a is null) as total_null,
(select count(*) from TABLENAME where a is not null) as total_not_null;
Just in case you wanted it in a single record:
select
(select count(*) from tbl where colName is null) Nulls,
(select count(*) from tbl where colName is not null) NonNulls
;-)
for counting not null values
select count(*) from us where a is not null;
for counting null values
select count(*) from us where a is null;
I created the table in Postgres 10 and both of the following worked:
select count(*) from us
and
select count(*) filter (where a is null) from us
In my case I wanted the "null distribution" amongst multiple columns:
SELECT
(CASE WHEN a IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS a_null,
(CASE WHEN b IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS b_null,
(CASE WHEN c IS NULL THEN 'NULL' ELSE 'NOT-NULL' END) AS c_null,
...
count(*)
FROM us
GROUP BY 1, 2, 3,...
ORDER BY 1, 2, 3,...
As the '...' indicates, it is easily extendable to as many columns as needed.
Number of elements where a is null:
select count(*) from us where a is null;
Number of elements where a is not null:
select count(a) from us where a is not null;