SUM on a single column based on different tables in SQL

I have two tables, min_attribution and max_attribution, which look like this:
min_attribution
session_id attribution
1 search
2 home
max_attribution
session_id attribution
1 search
2 other
And here is the MRS
CREATE TABLE min_attribution
(session_id INT,
attribution VARCHAR(20)
)
CREATE TABLE max_attribution
(session_id INT,
attribution VARCHAR(20)
)
Insert into min_attribution values (1,'search')
Insert into min_attribution values (2,'home')
Insert into max_attribution values (1,'search')
Insert into max_attribution values (2,'other')
I am trying to write a query where, depending on the value of attribution, a score is given and added for each user ID. For example, if in the first table the value for attribution is search, add 40 and do the same with the other table, but adding 30. Expected output:
session_id search home other
1 70 0 0
2 0 40 30
What I tried was to create a column for each of the possible attribution values (there are only a few) and add the results from each table, starting with "search", but it is not adding properly. This is my query:
SELECT min_attribution.session_id, SUM(
(CASE WHEN min_attribution.attribution = "search" THEN 40 ELSE 0 END) +
(CASE WHEN max_attribution.attribution = "search" THEN 30 ELSE 0 END)) search
FROM min_attribution,
max_attribution
GROUP BY min_attribution.session_id
And the resulting table (current output, only for the search column):
session_id search
1 110
2 30
Any ideas? (I am using BigQuery.)

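I think you want union all, with each table carrying its own score:
select session_id,
sum(if(attribution = 'search', score, 0)) as search,
sum(if(attribution = 'home', score, 0)) as home,
sum(if(attribution = 'other', score, 0)) as other
from ((select session_id, attribution, 40 as score -- rows from min_attribution score 40
from min_attribution
) union all
(select session_id, attribution, 30 as score -- rows from max_attribution score 30
from max_attribution
)
) s
group by session_id;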
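To see why the comma join over-counts and how stacking the tables avoids it, here is a minimal sketch that replays the sample data in an in-memory SQLite database through Python's sqlite3 module. SQLite standing in for BigQuery is an assumption of this sketch, so the conditional sums use portable CASE expressions; the score column (40 for min_attribution, 30 for max_attribution) follows the description in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE min_attribution (session_id INT, attribution VARCHAR(20));
CREATE TABLE max_attribution (session_id INT, attribution VARCHAR(20));
INSERT INTO min_attribution VALUES (1, 'search'), (2, 'home');
INSERT INTO max_attribution VALUES (1, 'search'), (2, 'other');
""")

# The comma join pairs every min row with every max row (2 x 2 = 4 rows),
# so each score is added once per row of the other table.
cross_join = conn.execute("""
SELECT mn.session_id,
       SUM((CASE WHEN mn.attribution = 'search' THEN 40 ELSE 0 END) +
           (CASE WHEN mx.attribution = 'search' THEN 30 ELSE 0 END)) AS search
FROM min_attribution mn, max_attribution mx
GROUP BY mn.session_id
ORDER BY mn.session_id
""").fetchall()

# Stacking the tables keeps one row per (session, source table), each with its score.
stacked = conn.execute("""
SELECT session_id,
       SUM(CASE WHEN attribution = 'search' THEN score ELSE 0 END) AS search,
       SUM(CASE WHEN attribution = 'home'   THEN score ELSE 0 END) AS home,
       SUM(CASE WHEN attribution = 'other'  THEN score ELSE 0 END) AS other
FROM (SELECT session_id, attribution, 40 AS score FROM min_attribution
      UNION ALL
      SELECT session_id, attribution, 30 AS score FROM max_attribution) s
GROUP BY session_id
ORDER BY session_id
""").fetchall()

print(cross_join)  # the 110 / 30 result reported in the question
print(stacked)     # the expected output from the question
```

Here cross_join reproduces the 110 and 30 from the question, while stacked returns the rows (1, 70, 0, 0) and (2, 0, 40, 30).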

Related

Compare two rows (both with different ID) & check if their column values are exactly the same. All rows & columns are in the same table

I have a table named "ROSTER" with 22 columns.
I want to compare any 2 rows of that table to check whether the values in every column are exactly the same. The ID column always differs between rows, so I will not include it in the comparison; I will only use it to pick the rows to compare.
If all column values are the same: either display nothing (I prefer this one) or return the 2 rows as they are.
If some column values differ: either display only those column names, or display both the column names and their values (I prefer this one).
Example:
ROSTER Table:
ID NAME TIME
1 N1 0900
2 N1 0801
Output:
ID TIME
1 0900
2 0801
OR
Display "TIME"
Note: I'm okay with any form of output, as long as it tells me that the 2 rows are not the same.
What are the possible ways to do this in SQL Server?
I am using Microsoft SQL Server Management Studio 18, Microsoft SQL Server 2019-15.0.2080.9
Please try the following solution based on the ideas of John Cappelletti. All credit goes to him.
SQL
-- DDL and sample data population, start
DECLARE @roster TABLE (ID INT PRIMARY KEY, NAME VARCHAR(10), TIME CHAR(4));
INSERT INTO @roster (ID, NAME, TIME) VALUES
(1,'N1','0900'),
(2,'N1','0801');
-- DDL and sample data population, end
DECLARE @source INT = 1
, @target INT = 2;
SELECT id AS source_id, @target AS target_id
,[key] AS [column]
,source_Value = MAX( CASE WHEN Src=1 THEN Value END)
,target_Value = MAX( CASE WHEN Src=2 THEN Value END)
FROM (
SELECT Src=1
,id
,B.*
FROM @roster AS A
CROSS APPLY ( SELECT [Key]
,Value
FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
) AS B
WHERE id=@source
UNION ALL
SELECT Src=2
,id = @source
,B.*
FROM @roster AS A
CROSS APPLY ( SELECT [Key]
,Value
FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
) AS B
WHERE id=@target
) AS A
GROUP BY id, [key]
HAVING MAX(CASE WHEN Src=1 THEN Value END)
<> MAX(CASE WHEN Src=2 THEN Value END)
AND [key] <> 'ID' -- exclude this PK column
ORDER BY id, [key];
Output
+-----------+-----------+--------+--------------+--------------+
| source_id | target_id | column | source_Value | target_Value |
+-----------+-----------+--------+--------------+--------------+
| 1 | 2 | TIME | 0900 | 0801 |
+-----------+-----------+--------+--------------+--------------+
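The same column-by-column comparison can also be sketched on the client side. The example below is only an illustration: it assumes an in-memory SQLite copy of the sample ROSTER rows driven from Python's sqlite3 module, and the helper names fetch_row and diff_rows are hypothetical. Column names are read from the cursor, so the 22 columns would not need to be listed by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE roster (ID INT PRIMARY KEY, NAME VARCHAR(10), TIME CHAR(4));
INSERT INTO roster VALUES (1, 'N1', '0900'), (2, 'N1', '0801');
""")

def fetch_row(conn, row_id):
    """Return one roster row as a {column name: value} dict (hypothetical helper)."""
    cur = conn.execute("SELECT * FROM roster WHERE ID = ?", (row_id,))
    row = cur.fetchone()
    if row is None:
        raise KeyError(row_id)
    return dict(zip((d[0] for d in cur.description), row))

def diff_rows(conn, source_id, target_id, key="ID"):
    """List (column, source value, target value) for every non-key column that differs."""
    source = fetch_row(conn, source_id)
    target = fetch_row(conn, target_id)
    return [(col, source[col], target[col])
            for col in source
            if col != key and source[col] != target[col]]

print(diff_rows(conn, 1, 2))
```

An empty list means the two rows match in every column other than ID. Note that this sketch treats two NULLs as equal, because Python compares None with None as equal.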
A general approach here might be to aggregate over the rows being compared and report, per column, whether they share a single distinct value:
SELECT
CASE WHEN COUNT(DISTINCT ID) = 1 THEN 'Yes' ELSE 'No' END AS [ID same],
CASE WHEN COUNT(DISTINCT NAME) = 1 THEN 'Yes' ELSE 'No' END AS [NAME same],
CASE WHEN COUNT(DISTINCT TIME) = 1 THEN 'Yes' ELSE 'No' END AS [TIME same]
FROM yourTable
WHERE ID IN (1, 2); -- the two rows being compared
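As a quick check of the aggregate idea, the sketch below runs it against the two sample rows in an in-memory SQLite database (an assumption; the question targets SQL Server). It treats a column as "same" when the selected rows hold exactly one distinct value in it. COUNT(DISTINCT ...) ignores NULLs, so a column that is NULL in one row and set in the other would still report 'Yes'; handle that case separately if it matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE roster (ID INT PRIMARY KEY, NAME VARCHAR(10), TIME CHAR(4));
INSERT INTO roster VALUES (1, 'N1', '0900'), (2, 'N1', '0801');
""")

# One distinct value across the selected rows means the column matches.
same_flags = conn.execute("""
SELECT CASE WHEN COUNT(DISTINCT NAME) = 1 THEN 'Yes' ELSE 'No' END AS name_same,
       CASE WHEN COUNT(DISTINCT TIME) = 1 THEN 'Yes' ELSE 'No' END AS time_same
FROM roster
WHERE ID IN (1, 2)
""").fetchone()

print(same_flags)
```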

postgresql unnest and pivot int array column

I have the table below:
create table test(id serial, key int,type text,words text[],numbers int[] );
insert into test(key,type,words) select 1,'Name',array['Table'];
insert into test(key,type,numbers) select 1,'product_id',array[2];
insert into test(key,type,numbers) select 1,'price',array[40];
insert into test(key,type,numbers) select 1,'Region',array[23,59];
insert into test(key,type,words) select 2,'Name',array['Table1'];
insert into test(key,type,numbers) select 2,'product_id',array[1];
insert into test(key,type,numbers) select 2,'price',array[34];
insert into test(key,type,numbers) select 2,'Region',array[23,59,61];
insert into test(key,type,words) select 3,'Name',array['Chair'];
insert into test(key,type,numbers) select 3,'product_id',array[5];
I was using the query below to pivot the table for users:
select key,
max(array_to_string(words,',')) filter(where type='Name') as "Name",
cast(max(array_to_string(numbers,',')) filter(where type='product_id') as int) as "product_id",
cast(max(array_to_string(numbers,',')) filter(where type='price') as int) as "price" ,
max(array_to_string(numbers,',')) filter(where type='Region') as "Region"
from test group by key
But I couldn't unnest the Region column during the pivot in order to use it to join with another table.
My expected output is below
Since we are using unnest("Region") to pivot, there must be a row with region data for each product.
Otherwise, the code below does the trick by substituting an array that holds a single null:
unnest(CASE WHEN array_length("Region", 1) >= 1
THEN "Region"
ELSE '{null}'::int[] END)
Schema:
create table test(id serial, key int,type text,words text[],numbers int[] );
insert into test(key,type,words) select 1,'Name',array['Table'];
insert into test(key,type,numbers) select 1,'product_id',array[2];
insert into test(key,type,numbers) select 1,'price',array[40];
insert into test(key,type,numbers) select 1,'Region',array[23,59];
insert into test(key,type,words) select 2,'Name',array['Table1'];
insert into test(key,type,numbers) select 2,'product_id',array[1];
insert into test(key,type,numbers) select 2,'price',array[34];
insert into test(key,type,numbers) select 2,'Region',array[23,59,61];
insert into test(key,type,words) select 3,'Name',array['Chair'];
insert into test(key,type,numbers) select 3,'product_id',array[5];
select key,"Name",product_id,price,unnest(CASE WHEN array_length("Region", 1) >= 1
THEN "Region"
ELSE '{null}'::int[] END) from
(
select key,
max(array_to_string(words,',')) filter(where type='Name') as "Name",
cast(max(array_to_string(numbers,',')) filter(where type='product_id') as int) as "product_id",
cast(max(array_to_string(numbers,',')) filter(where type='price') as int) as "price" ,
max(numbers) filter(where type='Region') as "Region"
from test group by key
)t order by key
key Name product_id price unnest
1 Table 2 40 23
1 Table 2 40 59
2 Table1 1 34 23
2 Table1 1 34 59
2 Table1 1 34 61
3 Chair 5 null null
Very strange database design... I'm assuming you inherited it?
If none of the other array values will ever have a cardinality > 1, then you can simply unnest:
select
key,
(max (words) filter (where type = 'Name'))[1] as name,
(max (numbers) filter (where type = 'product_id'))[1] as product_id,
(max (numbers) filter (where type = 'price'))[1] as price,
unnest (max (numbers) filter (where type = 'Region')) as region
from test
group by key
If they can have multiple values, that can also be handled.
-- EDIT 3/15/2021 --
Short version: an unnest against a null won't produce a row, so if you coalesce the null value into an array of a single null element, that should take care of this part:
select
key,
(max (words) filter (where type = 'Name'))[1] as name,
(max (numbers) filter (where type = 'product_id'))[1] as product_id,
(max (numbers) filter (where type = 'price'))[1] as price,
unnest (coalesce (max (numbers) filter (where type = 'Region'), array[null]::integer[])) as region
from test
group by key
order by key
Now for the part you didn't ask... I and at least one other have been gently nudging you that your database design is going to cause multiple problems at every turn. The fact that it's in production doesn't mean you shouldn't fix it as soon as you can.
This design is what's known as EAV - Entity - Attribute - Value. It has its use cases, but like most good things it can also be applied when it shouldn't. The use case that comes to mind is if you want users to be able to dynamically add attributes to certain objects. Even then, there might be better/easier ways.
And as one example, if you have one million objects, five attributes means you have to store that as five million rows, and the majority of that space will be occupied with repeating the key and attribute names.
Just food for thought. We can continue to triage this with every new scenario you find, but it would be better to redo the design.
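For comparison, here is one way a conventional design could look. This is a sketch under assumptions, not the asker's schema: the table and column names (product, product_region, product_key) are hypothetical, and it runs on an in-memory SQLite database through Python's sqlite3 module instead of PostgreSQL. With one row per product and one row per product/region pair, the desired output is a plain LEFT JOIN, and the product without regions still appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical normalized layout: one row per product, one row per region.
CREATE TABLE product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    product_id  INT,
    price       INT
);
CREATE TABLE product_region (
    product_key INT REFERENCES product (product_key),
    region      INT
);
INSERT INTO product VALUES (1, 'Table', 2, 40), (2, 'Table1', 1, 34), (3, 'Chair', 5, NULL);
INSERT INTO product_region VALUES (1, 23), (1, 59), (2, 23), (2, 59), (2, 61);
""")

# LEFT JOIN keeps product 3 even though it has no region rows.
rows = conn.execute("""
SELECT p.product_key, p.name, p.product_id, p.price, r.region
FROM product p
LEFT JOIN product_region r ON r.product_key = p.product_key
ORDER BY p.product_key, r.region
""").fetchall()

for row in rows:
    print(row)
```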

Pivot in SQL: count not working as expected

In my Oracle Responsys database I have a table whose records contain, among others, two variables:
status
location_id
I want to count the number of records grouped by status and location_id, and display it as a pivot table.
This seems to be the exact example that appears here
But when I use the following query:
select * from
(select status,location_id from $a$ )
pivot (count(status)
for location_id in (0,1,2,3,4)
) order by status
The values that appear in the pivot table are just the column names :
output :
status 0 1 2 3 4
-1 0 1 2 3 4
1 0 1 2 3 4
2 0 1 2 3 4
3 0 1 2 3 4
4 0 1 2 3 4
5 0 1 2 3 4
I also gave a try to the following :
select * from
(select status,location_id , count(*) as nbreports
from $a$ group by status,location_id )
pivot (sum(nbreports)
for location in (0,1,2,3,4)
) order by status
but it gives me the same result.
select status,location_id , count(*) as nbreports
from $a$
group by status,location_id
will of course give me the values I want, but as rows rather than as a pivot table.
How can I get the pivot table to have in each cell the number of records with the status and location in row and column?
Example data:
CUSTOMER,STATUS,LOCATION_ID
1,-1,1
2,1,1
3,2,1
4,3,0
5,4,2
6,5,3
7,3,4
The table data types :
CUSTOMER Text Field (to 25 chars)
STATUS Text Field (to 25 chars)
LOCATION_ID Number Field
Please check whether I have understood your requirement correctly; you can do the same the other way round for the location column.
create table test(
status varchar2(2),
location number
);
insert into test values('A',1);
insert into test values('A',2);
insert into test values('A',1);
insert into test values('B',1);
insert into test values('B',2);
select * from test;
select status,location,count(*)
from test
group by status,location;
select * from (
select status,location
from test
) pivot(count(*) for (status) in ('A' as STATUS_A,'B' as STATUS_B))
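If the PIVOT clause keeps misbehaving, the same cross-tab can be written with conditional aggregation, which most SQL engines accept. The sketch below is only an illustration: it loads the example data from the question into an in-memory SQLite table through Python's sqlite3 module, and the table name reports and the output column names loc_0 to loc_4 are hypothetical. Whether Responsys accepts this form is not something the sketch can confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (customer TEXT, status TEXT, location_id INTEGER)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [("1", "-1", 1), ("2", "1", 1), ("3", "2", 1), ("4", "3", 0),
     ("5", "4", 2), ("6", "5", 3), ("7", "3", 4)],
)

# One SUM(CASE ...) per location turns each location_id into its own column.
pivot = conn.execute("""
SELECT status,
       SUM(CASE WHEN location_id = 0 THEN 1 ELSE 0 END) AS loc_0,
       SUM(CASE WHEN location_id = 1 THEN 1 ELSE 0 END) AS loc_1,
       SUM(CASE WHEN location_id = 2 THEN 1 ELSE 0 END) AS loc_2,
       SUM(CASE WHEN location_id = 3 THEN 1 ELSE 0 END) AS loc_3,
       SUM(CASE WHEN location_id = 4 THEN 1 ELSE 0 END) AS loc_4
FROM reports
GROUP BY status
ORDER BY status
""").fetchall()

for row in pivot:
    print(row)
```

Each output row is a status followed by the number of records at locations 0 to 4; status is stored as text here, matching the data types given in the question.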

SQL Select with Priority

I need to select top 1 most valid discount for a given FriendId.
I have the following tables:
DiscountTable - describes different discount types
DiscountId, Percent, Type, Rank
1 , 20 , Friend, 2
2 , 10 , Overwrite, 1
Then I have another two tables (both list FriendIds)
Friends
101
102
103
Overwrites
101
105
I have to select top 1 most valid discount for a given FriendId. So for the above data this would be sample output
Id = 101 => gets "Overwrite" discount (higher rank)
Id = 102 => gets "Friend" discount (only in friends table)
Id = 103 => gets "Friend" discount (only in friends table)
Id = 105 => gets "Overwrite" discount
Id = 106 => gets NO discount as it exists in neither the Friends nor the Overwrites table
INPUT => SINGLE friendId (int).
OUTPUT => Single DISCOUNT Record (DiscountId, Percent, Type)
The Overwrites and Friends tables have the same structure: each holds only a list of Ids (single column).
Having multiple tables of identical structure is usually bad practice; a single table with ID and Type would suffice. You could then use it in a JOIN to your DiscountTable:
;WITH cte AS (SELECT ID,[Type] = 'Friend'
FROM Friends
UNION ALL
SELECT ID,[Type] = 'Overwrite'
FROM Overwrites
)
SELECT TOP 1 a.[Type]
FROM cte a
JOIN DiscountTable DT
ON a.[Type] = DT.[Type]
WHERE ID = '105'
ORDER BY [Rank]
Note, non-existent ID values will not return.
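A runnable sketch of this approach follows, returning the full discount record the question asks for. It assumes an in-memory SQLite database driven from Python's sqlite3 module, so LIMIT 1 plays the role of TOP 1; the single column of Friends and Overwrites is assumed to be called FriendId, and the function name best_discount is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DiscountTable (DiscountId INT, "Percent" INT, "Type" TEXT, "Rank" INT);
CREATE TABLE Friends (FriendId INT);
CREATE TABLE Overwrites (FriendId INT);
INSERT INTO DiscountTable VALUES (1, 20, 'Friend', 2), (2, 10, 'Overwrite', 1);
INSERT INTO Friends VALUES (101), (102), (103);
INSERT INTO Overwrites VALUES (101), (105);
""")

def best_discount(conn, friend_id):
    """Return (DiscountId, Percent, Type) with the lowest Rank, or None if not eligible."""
    return conn.execute("""
        SELECT dt.DiscountId, dt."Percent", dt."Type"
        FROM (SELECT FriendId, 'Friend' AS "Type" FROM Friends
              UNION ALL
              SELECT FriendId, 'Overwrite' AS "Type" FROM Overwrites) elig
        JOIN DiscountTable dt ON dt."Type" = elig."Type"
        WHERE elig.FriendId = ?
        ORDER BY dt."Rank"   -- Rank 1 takes priority over Rank 2
        LIMIT 1
    """, (friend_id,)).fetchone()

for fid in (101, 102, 105, 106):
    print(fid, best_discount(conn, fid))
```

An Id that appears in neither table yields None, which corresponds to the "no discount" case in the question.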
This will get you all the FriendIds and the associated discount of the highest rank. It's an older hack that doesn't require using top or row numbering.
select
elig.FriendId,
min(Rank * 10000 + DiscountId) % 10000 as DiscountId,
min(Rank * 10000 + [Percent]) % 10000 as [Percent]
from
DiscountTable as dt
inner join (
select FriendId, 'Friend' as Type from Friends union all
select FriendId, 'Overwrite' from Overwrites
) as elig /* for eligible? */
on elig.Type = dt.Type
group by
elig.FriendId
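The arithmetic behind this hack can be checked in a few lines. The sketch below assumes every DiscountId and Percent is a non-negative integer below 10000, which the trick itself requires; the function name pick_by_rank is hypothetical. Multiplying the rank by 10000 makes it dominate the comparison, so MIN picks the lowest rank, and the modulo strips the rank off again.

```python
def pick_by_rank(rows, base=10000):
    """rows holds (rank, value) pairs; each value must satisfy 0 <= value < base."""
    # Pack rank and value into one number so that min() orders by rank first.
    packed = min(rank * base + value for rank, value in rows)
    # The remainder recovers the value that travelled with the winning rank.
    return packed % base

# Friend 101 is eligible for both discounts: (Rank 2, Id 1, 20%) and (Rank 1, Id 2, 10%).
discount_id = pick_by_rank([(2, 1), (1, 2)])
percent = pick_by_rank([(2, 20), (1, 10)])
print(discount_id, percent)
```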
create table discounts (id int, percent1 int, type1 varchar(12), rank1 int)
insert into discounts
values (1 , 20 , 'Friend', 2),
(2 , 10 , 'Overwrite', 1)
create table friends (friendid int)
insert into friends values (101),(102), (103)
create table overwrites (overwriteid int)
insert into overwrites values (101),(105)
select ids, isnull(percent1,0) as discount from (
select case when friendid IS null and overwriteid is null then 'no discount'
when friendid is null and overwriteid is not null then 'overwrite'
when friendid is not null and overwriteid is null then 'friend'
when friendid is not null and overwriteid is not null then (select top 1 TYPE1 from discounts order by rank1) -- rank 1 has priority
else '' end category
,ids
from tcase left outer join friends
on tcase.ids = friends.friendid
left join overwrites
on tcase.ids = overwrites.overwriteid
) category1 left join discounts
on category1.category=discounts.type1

SQL count one field two times in select with different parameters

I would like my query to count one column twice in my SELECT, based on its value. For example:
input: table
id | type
-------------|-------------
1 | 1
2 | 1
3 | 2
4 | 2
5 | 2
output: query (in 1 row, not two):
countfirst = 2 (two times 1)
countsecond = 3 (three times 2)
A default COUNT in a SELECT counts all rows in the query, but I would like to count rows based
on a value without limiting the query. With, for example, WHERE type = '1', type 2
gets filtered out and cannot be counted anymore.
Is there a solution for this case in SQL?
--- EXAMPLE USE (the situation above is simplified but the case is the same) ---
With one query I get all cars grouped by type from a table. There are two sign types: yellow (1 in the db) and grey (2 in the db). So that query gives the following output:
Renault - ten times found - two yellow signs - eight grey signs
Create a table using the script below.
CREATE TABLE [dbo].[temptbl](
[id] [int] NULL,
[type] [int] NULL
) ON [PRIMARY]
Execute the insert script as
insert into [temptbl] values(1,1)
insert into [temptbl] values(2,1)
insert into [temptbl] values(3,2)
insert into [temptbl] values(4,2)
insert into [temptbl] values(5,2)
Then execute the query.
;WITH cte as(
SELECT [type], Count([type]) cnt
FROM temptbl
GROUP BY [type]
)
SELECT * FROM cte
pivot (Sum([cnt]) for [type] in ([1],[2])) as AvgIncomePerDay
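An alternative that needs neither a CTE nor PIVOT is conditional aggregation: one SUM(CASE ...) per value, all in the same SELECT, so nothing is filtered out by a WHERE clause. The sketch below is a minimal illustration on an in-memory SQLite database through Python's sqlite3 module (an assumption; the answer above targets SQL Server), using the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE temptbl (id INT, type INT);
INSERT INTO temptbl VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
""")

# Every row is read once; each SUM only adds 1 for rows of its own type.
counts = conn.execute("""
SELECT COUNT(*) AS total,
       SUM(CASE WHEN type = 1 THEN 1 ELSE 0 END) AS countfirst,
       SUM(CASE WHEN type = 2 THEN 1 ELSE 0 END) AS countsecond
FROM temptbl
""").fetchone()

print(counts)
```

The result is a single row holding the total together with both per-type counts, which matches the "in 1 row, not two" requirement.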
You can use the GROUP BY clause as Mureinik suggested, but with the addition of a WHERE clause to filter the results.
The query below shows the results for type = 1 (assuming type is an INT):
SELECT type, COUNT(*) AS NoOfRecords
FROM table
WHERE type IN (1)
GROUP BY type
So if we wanted 1 and 2 we can use:
SELECT type, COUNT(*) AS NoOfRecords
FROM table
WHERE type IN (1, 2)
GROUP BY type
Lastly, that IN statement can pull type from another query:
SELECT type, COUNT(*) AS NoOfRecords
FROM table
WHERE type IN (SELECT type FROM someOtherTable)
GROUP BY type