SQL Server creating view need a statement after having - sql

I have this SQL Server table with this data:
ID Name Type
-------------------
1 ZZ INPUT
2 AA INPUT
3 CC OUTPUT
4 ZZ OUTPUT
5 AA INPUT
6 CC INPUT
7 KK OUTPUT
8 TT INPUT
9 CC OUTPUT
10 DD OUTPUT
As a result, I would like only the names that are used exactly once, and of those, only the ones with the OUTPUT type.
Correct result
ID Name Type
-------------------
1 KK OUTPUT
2 DD OUTPUT
I can do it by creating two views, using the first one as an intermediate view. Can I achieve the result with one view?

You only need a GROUP BY query that checks for count(*) = 1:
select row_number() over (order by min(ID)) as ID, Name, min(Type) as Type
from your_table
group by Name
having count(*) = 1
and min(Type) = 'OUTPUT';
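To sanity-check the GROUP BY / HAVING approach against the sample data, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained); the table name your_table is taken from the answer, and the ORDER BY MIN(ID) is added just to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (ID INTEGER, Name TEXT, Type TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [(1, "ZZ", "INPUT"), (2, "AA", "INPUT"), (3, "CC", "OUTPUT"),
     (4, "ZZ", "OUTPUT"), (5, "AA", "INPUT"), (6, "CC", "INPUT"),
     (7, "KK", "OUTPUT"), (8, "TT", "INPUT"), (9, "CC", "OUTPUT"),
     (10, "DD", "OUTPUT")],
)

# Keep names that occur exactly once. For a one-row group,
# MIN(Type) is simply that row's Type, so the second condition
# filters on the type of the single row.
rows = conn.execute(
    """
    SELECT Name, MIN(Type) AS Type
    FROM your_table
    GROUP BY Name
    HAVING COUNT(*) = 1 AND MIN(Type) = 'OUTPUT'
    ORDER BY MIN(ID)
    """
).fetchall()
print(rows)
```

TT is also used only once, but it is dropped because its single row has type INPUT.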

Related

SQL compares the value of 2 columns and select the column with max value row-by-row

I have a table something like:
GROUP  NAME  Value_1  Value_2
------------------------------
1      ABC   0        0
1      DEF   4        4
50     XYZ   6        6
50     QWE   6        7
100    XYZ   26       2
100    QWE   26       2
What I would like to do is group by GROUP and select the name with the highest Value_1. If the Value_1 values are tied, compare Value_2 and select the row with the higher one. If they're still the same, select the first one.
The output will be something like:
GROUP  NAME  Value_1  Value_2
------------------------------
1      DEF   4        4
50     QWE   6        7
100    XYZ   26       2
The challenge for me here is that I don't know how many categories there are in NAME, so a simple CASE WHEN does not work. Thanks for the help.
You can use window functions to solve the bulk of your problem:
select t.*
from (select t.*,
row_number() over (partition by "group" order by Value_1 desc, Value_2 desc) as seqnum
from t
) t
where seqnum = 1;
The one caveat is the condition:
If they're still the same, select the first one.
SQL tables represent unordered (multi-) sets. There is no "first" one unless a column specifies the ordering. The best you can do is choose an arbitrary value when all the other values are the same.
That said, you might have another column that has an ordering. If so, add that as a third key to the order by.
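As a rough illustration of adding a third ordering key, the sketch below runs the window-function query on the sample data using Python's built-in sqlite3 module (SQLite 3.25 or later is assumed for window function support). The id column is hypothetical: it stands in for whatever column records the intended "first" row, since the original table shows no such column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "group" is a reserved word, so it is quoted. The id column is a
# hypothetical ordering column that is not part of the original table.
conn.execute(
    'CREATE TABLE t (id INTEGER, "group" INTEGER, name TEXT, '
    "value_1 INTEGER, value_2 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "ABC", 0, 0), (2, 1, "DEF", 4, 4),
     (3, 50, "XYZ", 6, 6), (4, 50, "QWE", 6, 7),
     (5, 100, "XYZ", 26, 2), (6, 100, "QWE", 26, 2)],
)

# Rank rows within each group; id breaks ties deterministically
# when value_1 and value_2 are both equal.
query = '''
SELECT "group", name, value_1, value_2
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY "group"
               ORDER BY value_1 DESC, value_2 DESC, id
           ) AS seqnum
    FROM t
) AS ranked
WHERE seqnum = 1
ORDER BY "group"
'''
rows = conn.execute(query).fetchall()
print(rows)
```

Without the id key, either row of group 100 could legitimately come back, because both have the same value_1 and value_2.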

Transform table to one-hot encoding for many rows

I have a SQL table of the following format:
ID Cat
1 A
1 B
1 D
1 F
2 B
2 C
2 D
3 A
3 F
Now, I want to create a table with one ID per row, and multiple Cat's in a row. My desired output looks as follows:
ID A B C D E F
1 1 1 0 1 0 1
2 0 1 1 1 0 0
3 1 0 0 0 0 1
I have found:
Transform table to one-hot-encoding of single column value
However, I have more than 1000 Cat's, so I am looking for code to write this automatically, rather than manually. Who can help me with this?
First let me transform the data you pasted into an actual table:
WITH data AS (
SELECT REGEXP_EXTRACT(data2, '[0-9]') id, REGEXP_EXTRACT(data2, '[A-Z]') cat
FROM (
SELECT SPLIT("""1 A
1 B
1 D
1 F
2 B
2 C
2 D
3 A
3 F""", '\n') AS data1
), UNNEST(data1) data2
)
SELECT * FROM data
(try sharing a table next time)
Now we can do some manual 1-hot encoding:
SELECT id
, MAX(IF(cat='A',1,0)) cat_A
, MAX(IF(cat='B',1,0)) cat_B
, MAX(IF(cat='C',1,0)) cat_C
FROM data
GROUP BY id
Now we want to write a script that will automatically create the columns we want:
SELECT STRING_AGG(FORMAT("MAX(IF(cat='%s',1,0))cat_%s", cat, cat), ', ')
FROM (
SELECT DISTINCT cat
FROM data
ORDER BY 1
)
That generates a string that you can copy paste into a query, that 1-hot encodes your arrays/rows:
SELECT id
,
MAX(IF(cat='A',1,0))cat_A, MAX(IF(cat='B',1,0))cat_B, MAX(IF(cat='C',1,0))cat_C, MAX(IF(cat='D',1,0))cat_D, MAX(IF(cat='F',1,0))cat_F
FROM data
GROUP BY id
And that's exactly what the question was asking for. You can generate SQL with SQL, but you'll need to write a new query using that result.
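The generate-then-run idea can also be driven from a client program, so that nothing has to be copied and pasted by hand. The following is a minimal sketch of that pattern using Python's built-in sqlite3 module instead of BigQuery (an assumption made to keep the example self-contained; CASE WHEN replaces IF because SQLite has no IF function). It assumes category values are plain alphanumeric strings, and it refuses anything else rather than interpolating it into SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, cat TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "D"), (1, "F"),
     (2, "B"), (2, "C"), (2, "D"),
     (3, "A"), (3, "F")],
)

# Step 1: discover the categories that actually occur in the data.
cats = [row[0] for row in conn.execute("SELECT DISTINCT cat FROM data ORDER BY cat")]

# Category values end up inside the SQL text, so only accept simple ones.
if not all(c.isalnum() for c in cats):
    raise ValueError("unexpected characters in a category value")

# Step 2: build one MAX(CASE ...) column per category.
columns = ", ".join(
    f"MAX(CASE WHEN cat = '{c}' THEN 1 ELSE 0 END) AS cat_{c}" for c in cats
)
query = f"SELECT id, {columns} FROM data GROUP BY id ORDER BY id"

# Step 3: run the generated query.
rows = conn.execute(query).fetchall()
print(cats)
print(rows)
```

As in the generated string above, a category that never appears in the data (E in the question's desired output) gets no column, because the column list is derived from the rows that exist.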
BigQuery standard SQL has no dynamic columns, but depending on what you want to do in the next step, there might be a way to make it easier.
The following code sample groups Cat by ID and uses a JavaScript function to do the one-hot encoding and return a JSON string.
CREATE TEMP FUNCTION trans(cats ARRAY<STRING>)
RETURNS STRING
LANGUAGE js
AS
"""
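// One-hot encode the categories of one ID and return them as a JSON string.
var encoded = {};
for (var i = 0; i < cats.length; i++) { encoded[cats[i]] = 1; }
return JSON.stringify(encoded);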
"""
;
WITH id_cat AS (
SELECT 1 as ID, 'A' As Cat UNION ALL
SELECT 1 as ID, 'B' As Cat UNION ALL
SELECT 1 as ID, 'C' As Cat UNION ALL
SELECT 2 as ID, 'A' As Cat UNION ALL
SELECT 3 as ID, 'C' As Cat)
SELECT ID, trans(ARRAY_AGG(Cat))
FROM id_cat
GROUP BY ID;

CTE - sum multiple rows with same id (with conditions)

I want to use SQL to sum multiple rows with same id when wk_days > 10
Sample data :
staff_id wk_days
--------------------
A 5
B 27
B 4
C 13
D 5
Output data :
staff_id wk_days
--------------------
A 5
B 31
C 13
D 5
The output above is what I want. I think I can use a CTE to do it. How can I write this SQL query?
No CTE needed, that is a simple group by query:
select staff_id, sum(wk_days) as wk_days
from the_table
group by staff_id
order by staff_id;
Online example: https://rextester.com/MBE21399
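For a quick local check of the GROUP BY query on the sample data, here is a minimal sketch using Python's built-in sqlite3 module (the database engine is an assumption; the table name the_table comes from the answer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (staff_id TEXT, wk_days INTEGER)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [("A", 5), ("B", 27), ("B", 4), ("C", 13), ("D", 5)],
)

# One output row per staff_id, with that staff member's wk_days summed.
rows = conn.execute(
    """
    SELECT staff_id, SUM(wk_days) AS wk_days
    FROM the_table
    GROUP BY staff_id
    ORDER BY staff_id
    """
).fetchall()
print(rows)
```

Only staff_id B has more than one row, so it is the only value that changes (27 + 4 = 31).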

SP to populate data as Column store style

I have a table that stores data in the usual way.
Id | Name | Number
----+------+-------
1 A 101
2 B 102
3 A 103
4 A 105
5 C 104
6 B 106
7 C 108
and so on.
Now I want to convert this table to something similar to column store.
For example, all the records should be ordered and grouped by Name.
Also, if a new record arrives with the same Name, it should be assigned an ID that is in the range assigned to that name group.
Just to elaborate: if 'A' has an ID range from 1 to 20 and there are currently 5 IDs in the table, then when a new record arrives with Name A, it should be assigned ID = 6.
The same goes for the other names. Every time an ID is populated, the NextID in the meta table has to be incremented by 1.
As of now I have created a meta table which stores the Min, max ID along with next ID for each name group.
MetaTable
Name MinID MaxId NextID
---------------------------
A 1 30 6
B 31 60 45
C 61 100 78
I am using CASE statements to populate the data in the main table, but it's very inefficient and the query is long-running.
Note: The Number column does not matter.
What could be a more efficient and faster way to achieve this?
SELECT Name,
MIN( ID ) AS MinID,
MAX( ID ) AS MaxID,
-- Plain aggregate: a window COUNT(*) OVER here would count grouped rows, which is always 1 per Name.
COUNT(*) + 1 AS NextID
FROM YourTable
GROUP BY Name;
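The allocation scheme described in the question (look up NextID for the name, use it, then increment it) can be sketched without any CASE statements. The example below is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than a stored procedure, MainTable and insert_record are hypothetical names, and the MetaTable contents are copied from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MetaTable (Name TEXT PRIMARY KEY, MinID INTEGER, "
    "MaxID INTEGER, NextID INTEGER)"
)
# MainTable is a hypothetical name for the table that receives the records.
conn.execute("CREATE TABLE MainTable (Id INTEGER PRIMARY KEY, Name TEXT, Number INTEGER)")
conn.executemany(
    "INSERT INTO MetaTable VALUES (?, ?, ?, ?)",
    [("A", 1, 30, 6), ("B", 31, 60, 45), ("C", 61, 100, 78)],
)


def insert_record(conn, name, number):
    """Insert one record using the next free ID of its name group."""
    with conn:  # commit on success, roll back if anything raises
        row = conn.execute(
            "SELECT NextID, MaxID FROM MetaTable WHERE Name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(f"no ID range defined for {name!r}")
        next_id, max_id = row
        if next_id > max_id:
            raise ValueError(f"ID range for {name!r} is exhausted")
        conn.execute(
            "INSERT INTO MainTable VALUES (?, ?, ?)", (next_id, name, number)
        )
        conn.execute(
            "UPDATE MetaTable SET NextID = NextID + 1 WHERE Name = ?", (name,)
        )
    return next_id


first = insert_record(conn, "A", 109)
second = insert_record(conn, "A", 110)
next_for_a = conn.execute(
    "SELECT NextID FROM MetaTable WHERE Name = 'A'"
).fetchone()[0]
print(first, second, next_for_a)
```

Each insert touches exactly one MetaTable row by its key, so the cost does not grow with the number of name groups. A real stored procedure would also need to guard against two sessions reading the same NextID at once, which this single-connection sketch does not address.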

How to get an index of different category returned by "order by" sql in oracle?

We can easily get a SQL result like the following:
SQL>select Name, Value from table order by Name;
Name Value
------------
A 1
A 2
B 1
C 5
C 6
C 7
However, is there a way to link the name to a number so that an index of different names can be formed? Suppose we don't know how many different names are in the table and don't know what they are.
Name Value idx
-----------------
A 1 0
A 2 0
B 1 1
C 5 2
C 6 2
C 7 2
This can easily be done using a window function:
select Name,
Value,
dense_rank() over (order by name) - 1 as idx
from table
order by Name;
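To see the dense_rank() numbering on the sample data, here is a minimal sketch using Python's built-in sqlite3 module in place of Oracle (an assumption made so the example is self-contained; SQLite 3.25 or later is needed for window functions). The table name items is hypothetical, because table itself is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (Name TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("A", 1), ("A", 2), ("B", 1), ("C", 5), ("C", 6), ("C", 7)],
)

# DENSE_RANK gives equal names the same rank with no gaps between ranks;
# subtracting 1 makes the index start at 0.
rows = conn.execute(
    """
    SELECT Name, Value, DENSE_RANK() OVER (ORDER BY Name) - 1 AS idx
    FROM items
    ORDER BY Name, Value
    """
).fetchall()
print(rows)
```

RANK() would not work here, since it leaves gaps after ties (A, A, B would be ranked 1, 1, 3).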