sql query for getting items not rated by both users

Say I have the following table:
+------------+------+--------+
| reviewerID | item | rating |
+------------+------+--------+
| 1 | 1 | 5|
| 1 | 2 | 5|
| 1 | 3 | 5|
| 2 | 4 | 5|
| 2 | 1 | 5|
| 2 | 2 | 5|
+------------+------+--------+
I want to get the items rated by reviewer 1 but not by reviewer 2, and vice versa, in a single result. The output should look something like this:
+------------+------+--------+
| reviewerID | item | rating |
+------------+------+--------+
| 1 | 3 | 5|
| 2 | 4 | 5|
+------------+------+--------+

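You could count how many of those two reviewers rated each item and select only the items rated by exactly one of them. Filter the outer query on the same two reviewers so that ratings from anyone else are not returned (this assumes a reviewer rates an item at most once):
SELECT *
FROM mytable
WHERE reviewerID IN (1, 2)
  AND item IN (SELECT item
               FROM mytable
               WHERE reviewerID IN (1, 2)
               GROUP BY item
               HAVING COUNT(*) = 1)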
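A minimal, self-contained sketch of the counting approach. The table name ratings is hypothetical, and the last inserted row (a third reviewer) is an assumption that is not in the question; it is there only to show why the outer query should filter on the same two reviewers:

```sql
-- Hypothetical table name; the first six rows are the question's sample data.
CREATE TABLE ratings (reviewerID INT, item INT, rating INT);

INSERT INTO ratings (reviewerID, item, rating) VALUES
    (1, 1, 5), (1, 2, 5), (1, 3, 5),
    (2, 4, 5), (2, 1, 5), (2, 2, 5),
    (3, 3, 4);  -- assumed extra row: reviewer 3 also rated item 3

SELECT reviewerID, item, rating
FROM ratings
WHERE reviewerID IN (1, 2)           -- keep only the two reviewers being compared
  AND item IN (SELECT item
               FROM ratings
               WHERE reviewerID IN (1, 2)
               GROUP BY item
               HAVING COUNT(*) = 1)  -- rated by exactly one of the two
ORDER BY reviewerID, item;
-- Returns (1, 3, 5) and (2, 4, 5). Without the outer reviewerID filter,
-- the row (3, 3, 4) would be returned as well.
```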

If the table only ever contains these two reviewers, you can skip the reviewer filter and keep the items that appear only once:
SELECT a.*
FROM Reviewer a
JOIN (SELECT item
      FROM Reviewer
      GROUP BY item
      HAVING COUNT(item) < 2) b
  ON a.item = b.item
Hope it helps!

Related

Found number of rows before a value changes with a group by

I have a table like this one
CREATE TABLE Levels
([userid] int, [counter1] int, [counter2] int, [date] datetime)
;
counter2 is an incrementing value. date is just the datetime the row was created. counter1 is a field that can take different integer values, and userid is the id of the user.
This is an example of the data. You can find a bigger example with two users in sqlfiddle
| userid | counter1 | counter2 | date |
|--------|----------|----------|----------------------|
| 123 | 6 | 42 | 2010-07-31T00:12:28Z |
| 123 | 6 | 43 | 2010-11-20T00:11:15Z |
| 123 | 6 | 44 | 2011-03-12T00:15:07Z |
| 123 | 5 | 45 | 2011-07-02T01:11:09Z |
| 123 | 5 | 46 | 2011-10-22T00:24:18Z |
| 123 | 5 | 47 | 2012-02-10T23:51:54Z |
| 123 | 5 | 48 | 2012-06-01T23:43:26Z |
| 123 | 5 | 49 | 2012-09-21T23:43:59Z |
| 123 | 4 | 50 | 2013-01-11T23:52:43Z |
| 123 | 4 | 51 | 2013-05-03T23:49:25Z |
| 123 | 4 | 52 | 2013-08-23T23:48:24Z |
| 123 | 3 | 53 | 2013-12-14T00:01:20Z |
| 123 | 3 | 54 | 2014-04-04T23:45:45Z |
| 123 | 4 | 55 | 2014-07-25T23:44:34Z |
| 123 | 5 | 56 | 2014-11-14T23:46:11Z |
What I am trying to do is count how many consecutive rows have the same counter1 value before it changes. The other questions I found on Stack Overflow didn't work for me because counter1 can take the same value again later on, and I don't want that counted as the same run.
I am working in SQL Server 2008, so the LAG function is not available.
The desired result for the full example in sqlfiddle is
| userid | counter1 | count |
|--------|----------|-------|
| 123| 6| 3|
| 123| 5| 5|
| 123| 4| 3|
| 123| 3| 2|
| 123| 4| 1|
| 123| 5| 1|
| 123| 6| 2|
| 123| 5| 5|
| 123| 4| 2|
| 123| 5| 1|
| 123| 4| 5|
| 123| 5| 5|
| 345| 6| 2|
| 345| 6| 9|

This is a type of gaps-and-islands problem. Fortunately, you can use the difference of row numbers:
select userid, counter1, count(*)
from (select t.*,
row_number() over (partition by userid order by counter2) as seqnum,
row_number() over (partition by userid, counter1 order by counter2) as seqnum_2
from t
) t
group by userid, counter1, (seqnum - seqnum_2)
order by userid, min(counter2);
Note: This assumes that the ordering is based on counter2. If it is really based on date then you can use that column instead.
Why this works is a little tricky to explain. But if you look at the results from the subquery, you will see how the difference between the two row_number() values is constant when counter1 has the same value on adjacent rows.
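To make the row-number difference concrete, here is a small sketch that prints both sequences and their difference. The table levels_demo and its five rows are hypothetical, a cut-down variant of the question's data chosen so that counter1 returns to an earlier value:

```sql
-- Hypothetical table and rows, not the question's full data set.
CREATE TABLE levels_demo (userid INT, counter1 INT, counter2 INT);

INSERT INTO levels_demo (userid, counter1, counter2) VALUES
    (123, 6, 42), (123, 6, 43), (123, 5, 44), (123, 5, 45), (123, 6, 46);

SELECT userid, counter1, counter2,
       ROW_NUMBER() OVER (PARTITION BY userid ORDER BY counter2) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY userid, counter1 ORDER BY counter2) AS seqnum_2,
       ROW_NUMBER() OVER (PARTITION BY userid ORDER BY counter2)
     - ROW_NUMBER() OVER (PARTITION BY userid, counter1 ORDER BY counter2) AS grp
FROM levels_demo
ORDER BY userid, counter2;
-- counter1 | counter2 | seqnum | seqnum_2 | grp
--    6     |    42    |   1    |    1     |  0
--    6     |    43    |   2    |    2     |  0
--    5     |    44    |   3    |    1     |  2
--    5     |    45    |   4    |    2     |  2
--    6     |    46    |   5    |    3     |  2
```

Within one run of equal counter1 values both sequences advance by one per row, so grp stays constant. The last row shows that grp alone is not unique (the second run of 6 shares grp = 2 with the run of 5), which is why counter1 has to be part of the GROUP BY together with the difference.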

You don't actually need LEAD and LAG here; however, moving to a supported version of SQL Server, where LAG (and LEAD) are available, should be a priority.
WITH YourTable AS(
SELECT *
FROM (VALUES(123,6,42,CONVERT(datetime2(0),'2010-07-31T00:12:28Z')),
(123,6,43,CONVERT(datetime2(0),'2010-11-20T00:11:15Z')),
(123,6,44,CONVERT(datetime2(0),'2011-03-12T00:15:07Z')),
(123,5,45,CONVERT(datetime2(0),'2011-07-02T01:11:09Z')),
(123,5,46,CONVERT(datetime2(0),'2011-10-22T00:24:18Z')),
(123,5,47,CONVERT(datetime2(0),'2012-02-10T23:51:54Z')),
(123,5,48,CONVERT(datetime2(0),'2012-06-01T23:43:26Z')),
(123,5,49,CONVERT(datetime2(0),'2012-09-21T23:43:59Z')),
(123,4,50,CONVERT(datetime2(0),'2013-01-11T23:52:43Z')),
(123,4,51,CONVERT(datetime2(0),'2013-05-03T23:49:25Z')),
(123,4,52,CONVERT(datetime2(0),'2013-08-23T23:48:24Z')),
(123,3,53,CONVERT(datetime2(0),'2013-12-14T00:01:20Z')),
(123,3,54,CONVERT(datetime2(0),'2014-04-04T23:45:45Z')),
(123,4,55,CONVERT(datetime2(0),'2014-07-25T23:44:34Z')),
(123,5,56,CONVERT(datetime2(0),'2014-11-14T23:46:11Z')))V(userid,counter1,counter2,date)),
Grps AS (
    SELECT userid,
           counter1,
           counter2,
           [date],
           ROW_NUMBER() OVER (PARTITION BY userid ORDER BY [date]) -
           ROW_NUMBER() OVER (PARTITION BY userid, counter1 ORDER BY [date]) AS Grp
    FROM YourTable)
SELECT userid,
       counter1,
       COUNT(*) AS [count]
FROM Grps
GROUP BY userid,
         counter1,
         Grp
-- Without an ORDER BY the runs come back in arbitrary order.
ORDER BY userid,
         MIN(counter2);

Cross join remaining combinations

I am trying to build a table that gives me every product I could sell to each customer, based on the products they currently have.
Product Status Table
+-------------+--------------+----------------+
| customer_id | product_name | product_status |
+-------------+--------------+----------------+
| 1 | A | Active |
| 2 | B | Active |
| 2 | C | Active |
| 3 | A | Cancelled |
+-------------+--------------+----------------+
Now I am trying to cross join this with a hard-coded table so that I get 4 rows per customer_id, one for each of the 4 products in our portfolio, along with the statuses I would like to apply.
Portfolio Table
+--------------+------------+----------+
| product_name | status_1 | status_2 |
+--------------+------------+----------+
| A | Inelegible | Inactive |
| B | Inelegible | Inactive |
| C | Ineligible | Inactive |
| D | Inelegible | Inactive |
+--------------+------------+----------+
In my code I tried to use a CROSS JOIN to get 4 rows per customer_id. Unfortunately, customers that have more than one product end up with doubled or tripled rows.
This is my code:
SELECT
p.customer_id,
CASE WHEN p.product_name = pt.product_name THEN p.product_name ELSE pt.product_name END AS product_name,
CASE
WHEN p.product_name = pt.product_name THEN p.product_status
ELSE pt.status_1
END AS product_status
FROM
products AS p
CROSS JOIN
portfolio as pt
This is my current output:
+----+-------------+--------------+----------------+
| # | customer_id | product_name | product_status |
+----+-------------+--------------+----------------+
| 1 | 1 | A | Active |
| 2 | 1 | B | Inelegible |
| 3 | 1 | C | Inelegible |
| 4 | 1 | D | Inelegible |
| 5 | 2 | A | Ineligible |
| 6 | 2 | A | Ineligible |
| 7 | 2 | B | Active |
| 8 | 2 | B | Ineligible |
| 9 | 2 | C | Active |
| 10 | 2 | C | Ineligible |
| 11 | 2 | D | Ineligible |
| 12 | 2 | D | Ineligible |
| 13 | 3 | A | Cancelled |
| 14 | 3 | B | Ineligible |
| 15 | 3 | C | Ineligible |
| 16 | 3 | D | Ineligible |
+----+-------------+--------------+----------------+
As you can see, customer_id 2 has two rows for each product, and products B and C also appear with a status different from the one in the product status table.
What I would like to achieve, in this case, is a table with 12 rows, in which the current product/status from the product_status table is shown, and the remaining product/statuses from the portfolio table are added.
Expected output
+----+-------------+--------------+----------------+
| # | customer_id | product_name | product_status |
+----+-------------+--------------+----------------+
| 1 | 1 | A | Active |
| 2 | 1 | B | Inelegible |
| 3 | 1 | C | Inelegible |
| 4 | 1 | D | Inelegible |
| 5 | 2 | A | Ineligible |
| 6 | 2 | B | Active |
| 7 | 2 | C | Active |
| 8 | 2 | D | Ineligible |
| 9 | 3 | A | Cancelled |
| 10 | 3 | B | Ineligible |
| 11 | 3 | C | Ineligible |
| 12 | 3 | D | Ineligible |
+----+-------------+--------------+----------------+
Not sure if the CROSS JOIN is the best alternative, but now I am running out of ideas.

EDIT:
I thought of a cleaner solution. Cross join the distinct customer ids with the portfolio first, then right join products to that on customer_id and product_name, and coalesce the product statuses.
SELECT customer_id, product_name, coalesce(product_status, status_1) AS product_status
FROM products p
RIGHT JOIN (
    SELECT *
    FROM (SELECT DISTINCT customer_id FROM products) pro
    CROSS JOIN portfolio
) pt
USING (customer_id, product_name)
ORDER BY customer_id, product_name
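The same idea can be sketched as a self-contained script that avoids USING and RIGHT JOIN, which may be easier to port between engines. This is an illustrative variant, not the query above verbatim: the aliases c, pt and p are my own, and the sample statuses are re-typed from the question with the spelling normalized to 'Ineligible':

```sql
-- Sample data re-typed from the question (status spelling normalized).
CREATE TABLE products (customer_id INT, product_name VARCHAR(10), product_status VARCHAR(20));
INSERT INTO products (customer_id, product_name, product_status) VALUES
    (1, 'A', 'Active'), (2, 'B', 'Active'), (2, 'C', 'Active'), (3, 'A', 'Cancelled');

CREATE TABLE portfolio (product_name VARCHAR(10), status_1 VARCHAR(20), status_2 VARCHAR(20));
INSERT INTO portfolio (product_name, status_1, status_2) VALUES
    ('A', 'Ineligible', 'Inactive'), ('B', 'Ineligible', 'Inactive'),
    ('C', 'Ineligible', 'Inactive'), ('D', 'Ineligible', 'Inactive');

SELECT c.customer_id,
       pt.product_name,
       -- use the customer's real status when they own the product,
       -- otherwise fall back to the portfolio default
       COALESCE(p.product_status, pt.status_1) AS product_status
FROM (SELECT DISTINCT customer_id FROM products) AS c  -- one row per customer
CROSS JOIN portfolio AS pt                             -- x 4 products = 12 rows
LEFT JOIN products AS p
       ON p.customer_id = c.customer_id
      AND p.product_name = pt.product_name
ORDER BY c.customer_id, pt.product_name;
-- 12 rows: (1,A,Active), (2,B,Active), (2,C,Active), (3,A,Cancelled),
-- and 'Ineligible' for every other customer/product pair.
```

Cross joining the distinct customer ids, rather than the products rows themselves, is what removes the duplicates: a customer with two products contributes only one row to the cross join.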
Old answer:
The idea is to include information of all product names for a customer_id into a list, and check whether the product in portfolio is in that list.
(SELECT customer_id, pt_product_name as product_name, first(status_1) as product_status
FROM (
SELECT
customer_id,
p.product_name as p_product_name,
pt.product_name as pt_product_name,
product_status,
status_1,
status_2,
collect_list(p.product_name) over (partition by customer_id) AS product_list
FROM products p
CROSS JOIN portfolio pt
)
WHERE NOT array_contains(product_list, pt_product_name)
GROUP BY customer_id, product_name)
UNION ALL
(SELECT customer_id, p_product_name as product_name, first(product_status) as product_status
FROM (
SELECT
customer_id,
p.product_name as p_product_name,
pt.product_name as pt_product_name,
product_status,
status_1,
status_2,
collect_list(p.product_name) over (partition by customer_id) AS product_list
FROM products p
CROSS JOIN portfolio pt)
WHERE array_contains(product_list, pt_product_name)
GROUP BY customer_id, product_name)
ORDER BY customer_id, product_name;
which gives
+-----------+------------+--------------+
|customer_id|product_name|product_status|
+-----------+------------+--------------+
| 1| A| Active|
| 1| B| Inelegible|
| 1| C| Ineligible|
| 1| D| Inelegible|
| 2| A| Inelegible|
| 2| B| Active|
| 2| C| Active|
| 2| D| Inelegible|
| 3| A| Cancelled|
| 3| B| Inelegible|
| 3| C| Ineligible|
| 3| D| Inelegible|
+-----------+------------+--------------+
FYI the chunk before UNION ALL gives:
+-----------+------------+--------------+
|customer_id|product_name|product_status|
+-----------+------------+--------------+
| 1| B| Inelegible|
| 1| C| Ineligible|
| 1| D| Inelegible|
| 2| A| Inelegible|
| 2| D| Inelegible|
| 3| B| Inelegible|
| 3| C| Ineligible|
| 3| D| Inelegible|
+-----------+------------+--------------+
And the chunk after UNION ALL gives:
+-----------+------------+--------------+
|customer_id|product_name|product_status|
+-----------+------------+--------------+
| 1| A| Active|
| 2| B| Active|
| 2| C| Active|
| 3| A| Cancelled|
+-----------+------------+--------------+
Hope that helps!

SQL to add position depending on multiple columns

I have a table to which I am adding a position column, and I need to assign a numbered position to every row already in the table. The numbering restarts for each combination of 4 columns (name, fax, cart, area) that match between rows. For example:
id| name| fax | cart| area |
1| jim | 1 | 4 | 1 |
2| jim | 1 | 4 | 1 |
3| jim | 2 | 4 | 1 |
4| jim | 2 | 4 | 1 |
5| bob | 1 | 4 | 1 |
6| bob | 1 | 4 | 1 |
7| bob | 2 | 5 | 1 |
8| bob | 2 | 5 | 2 |
9| bob | 2 | 5 | 2 |
10| bob | 2 | 5 | 2 |
would result with
id| name| fax | cart| area | position
1| jim | 1 | 4 | 1 | 1
2| jim | 1 | 4 | 1 | 2
3| jim | 2 | 4 | 1 | 1
4| jim | 2 | 4 | 1 | 2
5| bob | 1 | 4 | 1 | 1
6| bob | 1 | 4 | 1 | 2
7| bob | 2 | 5 | 1 | 1
8| bob | 2 | 5 | 2 | 1
9| bob | 2 | 5 | 2 | 2
10| bob | 2 | 5 | 2 | 3
I need an sql query that will iterate over the table and add the position.

Use row_number():
select
t.*,
row_number() over(partition by name, fax, cart, area order by id) position
from mytable t
If you wanted an update query (this UPDATE ... FROM form is PostgreSQL syntax):
update mytable as t
set position = rn
from (
    select id, row_number() over(partition by name, fax, cart, area order by id) rn
    from mytable
) x
where x.id = t.id

Converting rows to columns and keeping data pairs

I have the following problem in MSSQL: I have a table, which contains 4 columns.
Example table:
JunctionId | type| color| value
1 | a | red | 5|
1 | b | green | 10|
2 | a | orange | 40|
2 | b | yellow | 35|
3 | a | blue | 6|
3 | b | cyan | 9|
Now, I'd like the following result:
1 | a | red | 5 | b | green | 10
2 | a | orange | 40 | b | yellow | 35
3 | a | blue | 6 | b | cyan | 9
I tried using PIVOT, but it returned multiple rows because of the differing values. I could use a self join, but I have 12 different 'type' values. Any ideas would be very welcome!
(note: I can't use this stackoverflow table thingy... sorry)

Self join time
select a1.junctionid,
a1.type as a_type,
a1.color as a_color,
a1.value as a_value,
a2.type as b_type,
a2.color as b_color,
a2.value as b_value
from MyTable a1
inner join MyTable a2
on a1.junctionid = a2.junctionid
where a1.type = 'a'
and a2.type = 'b'
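With 12 types the self join needs 11 joins. A different technique, conditional aggregation, does the same pivot in a single pass. The sketch below is an alternative to the self join rather than part of the answer above; it assumes each JunctionId has at most one row per type, and it drops the a_type/b_type columns since the output column names already carry the type:

```sql
-- Sample data re-typed from the question.
CREATE TABLE MyTable (JunctionId INT, type VARCHAR(1), color VARCHAR(10), value INT);
INSERT INTO MyTable (JunctionId, type, color, value) VALUES
    (1, 'a', 'red', 5),     (1, 'b', 'green', 10),
    (2, 'a', 'orange', 40), (2, 'b', 'yellow', 35),
    (3, 'a', 'blue', 6),    (3, 'b', 'cyan', 9);

SELECT JunctionId,
       -- each CASE keeps the value for one type and yields NULL otherwise;
       -- MAX then picks the single non-NULL value per junction
       MAX(CASE WHEN type = 'a' THEN color END) AS a_color,
       MAX(CASE WHEN type = 'a' THEN value END) AS a_value,
       MAX(CASE WHEN type = 'b' THEN color END) AS b_color,
       MAX(CASE WHEN type = 'b' THEN value END) AS b_value
FROM MyTable
GROUP BY JunctionId
ORDER BY JunctionId;
-- 1 | red    | 5  | green  | 10
-- 2 | orange | 40 | yellow | 35
-- 3 | blue   | 6  | cyan   | 9
```

Supporting another type means adding two more MAX(CASE ...) columns, with no extra joins.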

Select First Two Rows of Child Table

I need to return one row of data for a parent record. The parent record can have many child records, but I only want the first two rows (and the total count of rows for the parent).
Here is an example of data:
ParentTable
+-----------------------+
| ParentId | ParentData |
+-----------------------+
| 1| Stuff |
| 2| Things |
| 3| Foo |
| 4| Bar |
-------------------------
ChildTable
+-------------------------------+
| ChildId | ParentId| ChildData |
+-------------------------------+
| 1 | 1 | Alpha |
| 2 | 1 | Bravo |
| 3 | 2 | Charlie |
| 4 | 2 | Delta |
| 5 | 2 | Echo |
| 6 | 3 | Foxtrot |
---------------------------------
And here is my desired result:
+-----------------------------------------------------------------+
| ParentId | ParentData | ChildData1 | ChildData2 | ChildRowCount |
+-----------------------------------------------------------------+
| 1 | Stuff | Alpha | Bravo | 2 |
| 2 | Things | Charlie | Delta | 3 |
| 3 | Foo | Foxtrot | (NULL) | 1 |
| 4 | Bar | (NULL) | (NULL) | 0 |
-------------------------------------------------------------------
I'm not sure if this needs a sub-query, a temp table, or a JOIN or GROUP BY of some sort.
In the end I need to use this in SSIS, but I'm starting with a query and going to go from there.
What kind of query can accomplish this?

Use a derived table to number the rows of the child table and count the children per parent, then left join it to the parent table to get the desired result.
select distinct p.parentid, p.parentdata,
    max(case when c.rnum = 1 then c.childdata end) over(partition by p.parentid, p.parentdata) as childdata1,
    max(case when c.rnum = 2 then c.childdata end) over(partition by p.parentid, p.parentdata) as childdata2,
    coalesce(c.childrowcount, 0) as childrowcount
from parenttable p
left join (select c.*
                , row_number() over(partition by parentid order by childid) as rnum
                , count(*) over(partition by parentid) as childrowcount
           from childtable c) c
    on c.parentid = p.parentid
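The same result can also be produced with GROUP BY instead of SELECT DISTINCT over window aggregates. This is a sketch of that variant, not the answer's query; it assumes "first two" means the two lowest ChildId values per parent:

```sql
-- Sample data re-typed from the question.
CREATE TABLE ParentTable (ParentId INT, ParentData VARCHAR(20));
INSERT INTO ParentTable (ParentId, ParentData) VALUES
    (1, 'Stuff'), (2, 'Things'), (3, 'Foo'), (4, 'Bar');

CREATE TABLE ChildTable (ChildId INT, ParentId INT, ChildData VARCHAR(20));
INSERT INTO ChildTable (ChildId, ParentId, ChildData) VALUES
    (1, 1, 'Alpha'), (2, 1, 'Bravo'), (3, 2, 'Charlie'),
    (4, 2, 'Delta'), (5, 2, 'Echo'),  (6, 3, 'Foxtrot');

SELECT p.ParentId,
       p.ParentData,
       MAX(CASE WHEN c.rnum = 1 THEN c.ChildData END) AS ChildData1,
       MAX(CASE WHEN c.rnum = 2 THEN c.ChildData END) AS ChildData2,
       COUNT(c.ChildId) AS ChildRowCount  -- counts only matched children, so 0 for parent 4
FROM ParentTable p
LEFT JOIN (SELECT ChildId, ParentId, ChildData,
                  ROW_NUMBER() OVER (PARTITION BY ParentId ORDER BY ChildId) AS rnum
           FROM ChildTable) c
       ON c.ParentId = p.ParentId
GROUP BY p.ParentId, p.ParentData
ORDER BY p.ParentId;
-- 1 | Stuff  | Alpha   | Bravo | 2
-- 2 | Things | Charlie | Delta | 3
-- 3 | Foo    | Foxtrot | NULL  | 1
-- 4 | Bar    | NULL    | NULL  | 0
```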
