Sql join two tables with having clause - sql

I have two tables
First_id | Text Second_id | First_id | Date | Email
I need to get all records from first table having count from second table with date null and email null.
I have sql:
Select * from first f join second s on f.id = s.first_id where date is null and email is null group by first_id having(count(s.id) < 10 or count(s.id) = 0)
It works well, but where I have all data and email filled on second table for id from first table I got no result.
Sample data:
First table
1 | one
2 | two
Second table
1 | 1 | NULL | NULL
1 | 1 | 2015-01-01 | NULL
1 | 2 | 2015-01-01 | NULL
1 | 2 | 2015-01-01 | NULL
I expect on output:
1 | one | 1
2 | two | 0
last column is count of entries from second with date and email NULL. My query returns
1 | one | 1
No second row

Do a left join, also it's good to specify which columns you want to show, otherwise you will get duplicates.
Select * from first f left join second s on f.id = s.first_id where date is null and email is null group by first_id having(count(s.id) < 10 or count(s.id) = 0)

SELECT t1.First_id, t2.Second_id
FROM t1
LEFT JOIN t2
ON t1.First_id = t2.First_id
INNER JOIN (
SELECT Second_id
FROM t2
GROUP BY Second_id
HAVING (COUNT(*) < 10 OR COUNT(*) = 0)
) _cc
ON t.Second_id = _cc.Second_id
WHERE t2.date IS NULL AND t2.email IS NULL;
A solution is to check the HAVING restrictions in a subquery that returns the ids you need for the rest joins.
When you use the GROUP BY statement it is good to select only the GROUP BY column or an aggregate function otherwise you might have unpredictable results.
https://dev.mysql.com/doc/refman/5.1/en/group-by-handling.html
How to include other grouped columns

Related

How to right join these series so that this query will return results where count = 0 in Postgresql SQL?

I have the following query that runs on a Postgresql database:
SELECT NULL AS fromdate,
l.eventlevel,
SUM(CASE
WHEN e.id IS NULL THEN 0
ELSE 1
END) AS COUNT
FROM event e
RIGHT JOIN
(SELECT generate_series(0, 3) AS eventlevel) l ON e.event_level = l.eventlevel
WHERE e.project_id = :projectId
GROUP BY l.eventlevel
ORDER BY l.eventlevel DESC
With the (trimmed) event table:
TABLE public.event
id uuid NOT NULL,
event_level integer NOT NULL
This is a variant for a bucketed query but with all data, hence the NULL fromdate.
I'm trying to get counts from the table event and counted per event_level. But I also want the number 0 to return when there aren't any events for that particular event_level. But the current right join is not doing that job. What am I doing wrong?
I also tried adding OR e.project_id IS null thinking it might be filtering out the 0 counts. Or would this work with a CROSS JOIN and if so how?
Current result:
+----------+------------+-------+
| fromdate | eventlevel | count |
+----------+------------+-------+
| null | 3 | 1 |
+----------+------------+-------+
Desired result:
+----------+------------+-------+
| fromdate | eventlevel | count |
+----------+------------+-------+
| null | 3 | 1 |
| null | 2 | 0 |
| null | 1 | 0 |
| null | 0 | 0 |
+----------+------------+-------+
I recommend avoiding RIGHT JOINs and using LEFT JOINs. They are just simpler for following the logic -- keep everything in the first table and matching rows in the subsequent ones.
Your issue is the placement of the filter -- it filters out the outer joined rows. So that needs to go into the ON clause. I would recommend:
SELECT NULL AS fromdate, gs.eventlevel,
COUNT(e.id) as count
FROM generate_series(0, 3) gs(eventlevel) LEFT JOIN
event e
ON e.event_level = gs.eventlevel AND e.project_id = :projectId
GROUP BY gs.eventlevel
ORDER BY gs.eventlevel DESC;
Note the other simplifications:
No subquery is needed for generate_series.
You can use COUNT() instead of your case logic.
You have to move the e.project_id condition from the WHERE clause to the ON clause to get true RIGHT JOIN result:
...
END) AS COUNT
FROM event e
RIGHT JOIN
(SELECT generate_series(0, 3) AS eventlevel) l ON e.event_level = l.eventlevel
AND e.project_id = :projectId
...

Comparing different columns in SQL for each row

after some transformation I have a result from a cross join (from table a and b) where I want to do some analysis on. The table for this looks like this:
+-----+------+------+------+------+-----+------+------+------+------+
| id | 10_1 | 10_2 | 11_1 | 11_2 | id | 10_1 | 10_2 | 11_1 | 11_2 |
+-----+------+------+------+------+-----+------+------+------+------+
| 111 | 1 | 0 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
| 111 | 1 | 0 | 1 | 0 | 333 | 0 | 0 | 0 | 0 |
| 111 | 1 | 0 | 1 | 0 | 444 | 1 | 0 | 1 | 1 |
| 112 | 0 | 1 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
+-----+------+------+------+------+-----+------+------+------+------+
The ids in the first column are different from the ids in the sixth column.
In a row are always two different IDs that are matched with each other. The other columns always have either 0 or 1 as a value.
I am now trying to find out how many values(meaning both have "1" in 10_1, 10_2 etc) two IDs have on average in common, but I don't really know how to do so.
I was trying something like this as a start:
SELECT SUM(CASE WHEN a.10_1 = 1 AND b.10_1 = 1 then 1 end)
But this would obviously only count how often two ids have 10_1 in common. I could make something like this for example for different columns:
SELECT SUM(CASE WHEN (a.10_1 = 1 AND b.10_1 = 1)
OR (a.10_2 = 1 AND b.10_1 = 1) OR [...] then 1 end)
To count in general how often two IDs have one thing in common, but this would of course also count if they have two or more things in common. Plus, I would also like to know how often two IDS have two things, three things etc in common.
One "problem" in my case is also that I have like ~30 columns I want to look at, so I can hardly write down for each case every possible combination.
Does anyone know how I can approach my problem in a better way?
Thanks in advance.
Edit:
A possible result could look like this:
+-----------+---------+
| in_common | count |
+-----------+---------+
| 0 | 100 |
| 1 | 500 |
| 2 | 1500 |
| 3 | 5000 |
| 4 | 3000 |
+-----------+---------+
With the codes as column names, you're going to have to write some code that explicitly references each column name. To keep that to a minimum, you could write those references in a single union statement that normalizes the data, such as:
select id, '10_1' where "10_1" = 1
union
select id, '10_2' where "10_2" = 1
union
select id, '11_1' where "11_1" = 1
union
select id, '11_2' where "11_2" = 1;
This needs to be modified to include whatever additional columns you need to link up different IDs. For the purpose of this illustration, I assume the following data model
create table p (
id integer not null primary key,
sex character(1) not null,
age integer not null
);
create table t1 (
id integer not null,
code character varying(4) not null,
constraint pk_t1 primary key (id, code)
);
Though your data evidently does not currently resemble this structure, normalizing your data into a form like this would allow you to apply the following solution to summarize your data in the desired form.
select
in_common,
count(*) as count
from (
select
count(*) as in_common
from (
select
a.id as a_id, a.code,
b.id as b_id, b.code
from
(select p.*, t1.code
from p left join t1 on p.id=t1.id
) as a
inner join (select p.*, t1.code
from p left join t1 on p.id=t1.id
) as b on b.sex <> a.sex and b.age between a.age-10 and a.age+10
where
a.id < b.id
and a.code = b.code
) as c
group by
a_id, b_id
) as summ
group by
in_common;
The proposed solution requires first to take one step back from the cross-join table, as the identical column names are super annoying. Instead, we take the ids from the two tables and put them in a temporary table. The following query gets the result wanted in the question. It assumes table_a and table_b from the question are the same and called tbl, but this assumption is not needed and tbl can be replaced by table_a and table_b in the two sub-SELECT queries. It looks complicated and uses the JSON trick to flatten the columns, but it works here:
WITH idtable AS (
SELECT a.id as id_1, b.id as id_2 FROM
-- put cross join of table a and table b here
)
SELECT in_common,
count(*)
FROM
(SELECT idtable.*,
sum(CASE
WHEN meltedR.value::text=meltedL.value::text THEN 1
ELSE 0
END) AS in_common
FROM idtable
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_a
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedL ON (idtable.id_1 = meltedL.id)
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_b
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedR ON (idtable.id_2 = meltedR.id
AND meltedL.key = meltedR.key)
GROUP BY idtable.id_1,
idtable.id_2) tt
GROUP BY in_common ORDER BY in_common;
The output here looks like this:
in_common | count
-----------+-------
2 | 2
3 | 1
4 | 1
(3 rows)

SQL SELECT id and count of items in same table

I have the following SQL table columns...
id | item | position | set
---------------------------
1 | 1 | 1 | 1
2 | 1 | 1 | 2
3 | 2 | 2 | 1
4 | 3 | 2 | 2
In a single query I need to get all the ids of rows that match set='1' while simultaneously counting how many instances in the same table that it's item number is referenced regardless of the set.
Here is what I've been tinkering with so far...
SELECT
j1.item,
(SELECT count(j1.item) FROM table_join AS j2) AS count
FROM
table_join AS j1
WHERE
j1.set = '1';
...though the subquery is returning multiple rows. With the above data the first item should have a count of 2, all the other items should have a count of 1.
This should work:
SELECT
j.id
, (SELECT COUNT(*) FROM table_join i WHERE i.item = j.item) AS count
FROM table_join j
WHERE set='1'
This is similar to your query, but the subquery is coordinated with the outer query with the WHERE clause.
Demo.
As an alternative worth testing for performance, you can use a JOIN instead of a dependent subquery;
SELECT tj.id, COUNT(tj2.id) count
FROM table_join tj
LEFT JOIN table_join tj2 ON tj.item = tj2.item
WHERE tj.`set`=1
GROUP BY tj.id
An SQLfiddle to test with.

How to fill missing dates by groups in a table in sql

I want to know how to use loops to fill in missing dates with value zero based on the start/end dates by groups in sql so that i have consecutive time series in each group. I have two questions.
how to loop for each group?
How to use start/end dates for each group to dynamically fill in missing dates?
My input and expected output are listed as below.
Input: I have a table A like
date value grp_no
8/06/12 1 1
8/08/12 1 1
8/09/12 0 1
8/07/12 2 2
8/08/12 1 2
8/12/12 3 2
Also I have a table B which can be used to left join with A to fill in missing dates.
date
...
8/05/12
8/06/12
8/07/12
8/08/12
8/09/12
8/10/12
8/11/12
8/12/12
8/13/12
...
How can I use A and B to generate the following output in sql?
Output:
date value grp_no
8/06/12 1 1
8/07/12 0 1
8/08/12 1 1
8/09/12 0 1
8/07/12 2 2
8/08/12 1 2
8/09/12 0 2
8/10/12 0 2
8/11/12 0 2
8/12/12 3 2
Please send me your code and suggestion. Thank you so much in advance!!!
You can do it like this without loops
SELECT p.date, COALESCE(a.value, 0) value, p.grp_no
FROM
(
SELECT grp_no, date
FROM
(
SELECT grp_no, MIN(date) min_date, MAX(date) max_date
FROM tableA
GROUP BY grp_no
) q CROSS JOIN tableb b
WHERE b.date BETWEEN q.min_date AND q.max_date
) p LEFT JOIN TableA a
ON p.grp_no = a.grp_no
AND p.date = a.date
The innermost subquery grabs min and max dates per group. Then cross join with TableB produces all possible dates within the min-max range per group. And finally outer select uses outer join with TableA and fills value column with 0 for dates that are missing in TableA.
Output:
| DATE | VALUE | GRP_NO |
|------------|-------|--------|
| 2012-08-06 | 1 | 1 |
| 2012-08-07 | 0 | 1 |
| 2012-08-08 | 1 | 1 |
| 2012-08-09 | 0 | 1 |
| 2012-08-07 | 2 | 2 |
| 2012-08-08 | 1 | 2 |
| 2012-08-09 | 0 | 2 |
| 2012-08-10 | 0 | 2 |
| 2012-08-11 | 0 | 2 |
| 2012-08-12 | 3 | 2 |
Here is SQLFiddle demo
I just needed the query to return all the dates in the period I wanted. Without the joins. Thought I'd share for those wanting to put them in your query. Just change the 365 to whatever timeframe you are wanting.
DECLARE #s DATE = GETDATE()-365, #e DATE = GETDATE();
SELECT TOP (DATEDIFF(DAY, #s, #e)+1)
DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY number)-1, #s)
FROM [master].dbo.spt_values
WHERE [type] = N'P' ORDER BY number
The following query does a union with tableA and tableB. It then uses group by to merge the rows from tableA and tableB so that all of the dates from tableB are in the result. If a date is not in tableA, then the row has 0 for value and grp_no. Otherwise, the row has the actual values for value and grp_no.
select
dat,
sum(val),
sum(grp)
from
(
select
date as dat,
value as val,
grp_no as grp
from
tableA
union
select
date,
0,
0
from
tableB
where
date >= date '2012-08-06' and
date <= date '2012-08-13'
)
group by
dat
order by
dat
I find this query to be easier for me to understand. It also runs faster. It takes 16 seconds whereas a similar right join query takes 32 seconds.
This solution only works with numerical data.
This solution assumes a fixed date range. With some extra work this query can be adapted to limit the date range to what is found in tableA.

Mysql4: SQL for selecting one or zero record

Table layout:
CREATE TABLE t_order (id INT, custId INT, order DATE)
I'm looking for a SQL command to select a maximum of one row per order (the customer who owns the order is identified by a field named custId).
I want to select ONE of the customer's orders (doesn't matter which one, say sorted by id) if there is no order date given for any of the rows.
I want to retrieve an empty Resultset for the customerId, if there is already a record with given order date.
Here is an example. Per customer there should be one order at most (one without a date given). Orders that have already a date value should not appear at all.
+---------------------------------------------------------+
|id | custId | date |
+---------------------------------------------------------+
| 1 10 NULL |
| 2 11 2008-11-11 |
| 3 12 2008-10-23 |
| 4 11 NULL |
| 5 13 NULL |
| 6 13 NULL |
+---------------------------------------------------------+
|
|
| Result
\ | /
\ /
+---------------------------------------------------------+
|id | custId | date |
+---------------------------------------------------------+
| 1 10 NULL |
| |
| |
| |
| 5 13 NULL |
| |
+---------------------------------------------------------+
powered be JavE
Edit:
I've choosen glavić's answer as the correct one, because it provides
the correct result with slightly modified data:
+---------------------------------------------------------+
|id | custId | date |
+---------------------------------------------------------+
| 1 10 NULL |
| 2 11 2008-11-11 |
| 3 12 2008-10-23 |
| 4 11 NULL |
| 5 13 NULL |
| 6 13 NULL |
| 7 11 NULL |
+---------------------------------------------------------+
Sfossen's answer will not work when customers appear more than twice because of its where clause constraint a.id != b.id.
Quassnoi's answer does not work for me, as I run server version 4.0.24 which yields the following error:
alt text http://img25.imageshack.us/img25/8186/picture1vyj.png
For a specific customer it's:
SELECT *
FROM t_order
WHERE date IS NULL AND custId=? LIMIT 1
For all customers its:
SELECT a.*
FROM t_order a
LEFT JOIN t_order b ON a.custId=b.custID and a.id != b.id
WHERE a.date IS NULL AND b.date IS NULL
GROUP BY custId;
Try this:
SELECT to1.*
FROM t_order AS to1
WHERE
to1.date IS NULL AND
to1.custId NOT IN (
SELECT to2.custId
FROM t_order AS to2
WHERE to2.date IS NOT NULL
GROUP BY to2.custId
)
GROUP BY to1.custId
For MySQL 4:
SELECT to1.*
FROM t_order AS to1
LEFT JOIN t_order AS to2 ON
to2.custId = to1.custId AND
to2.date IS NOT NULL
WHERE
to1.date IS NULL AND
to2.id IS NULL
GROUP BY to1.custId
This query will use one pass over index on custId.
For each distinct custId it will use one subquery over same index.
No GROUP BY, no TEMPORARY and no FILESORT — efficient, if your table is large.
SELECT VERSION()
--------
'4.1.22-standard'
CREATE INDEX ix_order_cust_id ON t_order(custId)
SELECT id, custId, order_date
FROM (
SELECT o.*,
CASE
WHEN custId <> #c THEN
(
SELECT 1
FROM t_order oi
WHERE oi.custId = o.custId
AND order_date IS NOT NULL
LIMIT 1
)
END AS n,
#c <> custId AS f,
#c := custId
FROM
(
SELECT #c := -1
) r,
t_order o
ORDER BY custId
) oo
WHERE n IS NULL AND f
---------
1, 10, ''
5, 13, ''
First filter out rows with dates, then filter out any row that has a similar row with a lower id. This should work because the matching record with the least id is unique if id is unique.
select * from t_order o1
where date is null
and not exists (select * from t_order o2
where o2.date is null
and o1.custId = o2.custId
and o1.id > o2.id)