Group By with 'HAVING' clause on slick+play - sql

Imagine I have a SQL table grades which has, among other fields, the name of the student and the grade received:
| student | grade |
|----------|:---------:|
| Harry | Good |
| Ron | Good |
| Harry | Average |
| Harry | Fail |
| Hermione | Excellent |
| Hermione | Excellent |
| Ron | Average |
| ..... | .... |
If I wanted to select all the students with at least two 'Excellent' and zero 'Fail' grades, I could write:
select student
from grades
group by student
having
sum(case when grade = 'Excellent' then 1 else 0 end) >= 2 and
sum(case when grade = 'Fail' then 1 else 0 end) = 0
How could I translate such a query into Slick?
The 'Having' example given in the documentation seems simpler than what I need.
gradesTables
.groupBy(_.student)
.map{ case(student, group) => (student, ???)}
.filter(???)
.list
On a related note, why do I get an error with the following:
gradesTables
.groupBy(_.student)
.map{ case(student, group) => (student, group.filter(_.grade == "Fail").length)}
.list
The error is:
slick.SlickTreeException: Cannot convert node to SQL Comprehension

The following code in Slick will generate the SQL you need:
val query: Query[(Rep[String], Rep[Option[Int]], Rep[Option[Int]]), (String, Option[Int], Option[Int]), Seq] =
  grades.groupBy( _.student ).map{ case (student, group) =>
    val groupList = group.map(_.grade)
    val gradeExcel = groupList.map( grade =>
      Case.If(grade === "Excellent").Then(1).Else(0) ).sum
    val gradeFail = groupList.map( grade =>
      Case.If(grade === "Fail").Then(1).Else(0) ).sum
    (student, gradeExcel, gradeFail)
  }.
  filter( g => g._2 >= 2 && g._3 === 0 )
// ...
println("Generated SQL:\n" + query.result.statements)
// Generated SQL:
// List(
// select "STUDENT", sum((case when ("GRADE" = 'Excellent') then 1 else 0 end)),
// sum((case when ("GRADE" = 'Fail') then 1 else 0 end)) from "GRADES" group by "STUDENT"
// having (sum((case when ("GRADE" = 'Excellent') then 1 else 0 end)) >= 2) and
// (sum((case when ("GRADE" = 'Fail') then 1 else 0 end)) = 0)
// )
db.run(query.result.map(println))
// Vector((Hermione,Some(2),Some(0)))
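To sanity-check the generated SQL outside of Slick, here is a minimal Python sketch that runs the same conditional-sum HAVING query against an in-memory SQLite database. The table layout and the seven sample rows are taken from the question; using SQLite (rather than whatever database backs the Play application) is an assumption made purely for illustration.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (student TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?)",
    [
        ("Harry", "Good"), ("Ron", "Good"), ("Harry", "Average"),
        ("Harry", "Fail"), ("Hermione", "Excellent"),
        ("Hermione", "Excellent"), ("Ron", "Average"),
    ],
)

# Same shape as the SQL that the Slick query prints: one conditional sum per grade.
sql = """
SELECT student,
       SUM(CASE WHEN grade = 'Excellent' THEN 1 ELSE 0 END) AS excellent,
       SUM(CASE WHEN grade = 'Fail' THEN 1 ELSE 0 END) AS fail
FROM grades
GROUP BY student
HAVING SUM(CASE WHEN grade = 'Excellent' THEN 1 ELSE 0 END) >= 2
   AND SUM(CASE WHEN grade = 'Fail' THEN 1 ELSE 0 END) = 0
"""
result = conn.execute(sql).fetchall()
print(result)  # [('Hermione', 2, 0)]
```

Only Hermione satisfies both conditions, which agrees with the Vector printed by the Slick version.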

Related

Rails Query - Group By with 2 groups

In my website my users have an attribute localidade. This specifies where they live.
I'm trying to do a query where I group the results the following way:
localidade | Number of Users
-------------+--------------
New York | 6
Not New York | 8
I want the number of users from New York and the number of users from anywhere else but New York.
I tried this:
User.group("lower(localidade) = 'new york'").count
But since I don't have any users from New York and only one user from elsewhere, it returns:
{false => 1}
Am I able to give aliases to groups? Is there any way of grouping the results this way?
I'm going to use the results for a pie chart from Graphkick.
You could write your query as:
User.group("lower(localidade)")
.select("CASE WHEN lower(localidade) = 'new york' THEN COUNT(id) END AS NewYork,
CASE WHEN lower(localidade) != 'new york' THEN COUNT(id) END AS NonNewYork")
Since PostgreSQL 9.4, you can use FILTER with aggregate expressions:
User.group("lower(localidade)")
.select("COUNT(id) FILTER (WHERE lower(localidade) != 'new york') AS NonNewyork,
COUNT(id) FILTER (WHERE lower(localidade) = 'new york') AS Newyork")
I created a table to explain and test the SQL above, and it worked as expected:
[shreyas#rails_app_test (master)]$ rails db
psql (9.4.1)
Type "help" for help.
app_development=# select id, location, name from people;
id | location | name
----+----------+------
2 | X | foo
3 | X | foo
4 | Y | foo
(3 rows)
app_development=# SELECT COUNT(id) FILTER(WHERE lower(location) != 'x') AS Non_X_loc, COUNT(id) FILTER (WHERE lower(location) = 'x') AS X_loc FROM "people";
non_x_loc | x_loc
-----------+-------
1 | 2
(1 row)
Now let me jump to the Rails console and test the equivalent Rails code:
[2] pry(main)> p = Person.select("COUNT(id) FILTER(WHERE lower(location) != 'x') AS Non_X_loc, COUNT(id) FILTER (WHERE lower(location) = 'x') AS X_loc ")
Person Load (0.5ms) SELECT COUNT(id) FILTER(WHERE lower(location) != 'x') AS Non_X_loc, COUNT(id) FILTER (WHERE lower(location) = 'x') AS X_loc FROM "people"
=> [#<Person:0x007fd85ed71980 id: nil>]
[3] pry(main)> p.first.attributes
=> {"id"=>nil, "non_x_loc"=>1, "x_loc"=>2}
[6] pry(main)> Person.group("lower(location)").select("CASE WHEN lower(location) = 'x' THEN COUNT(id) END AS X_loc, CASE WHEN lower(location) != 'x' THEN COUNT(id) END AS Non_X_loc")
Person Load (0.6ms) SELECT CASE WHEN lower(location) = 'x' THEN COUNT(id) END AS X_loc, CASE WHEN lower(location) != 'x' THEN COUNT(id) END AS Non_X_loc FROM "people" GROUP BY lower(location)
=> [#<Person:0x007fd8608281e8 id: nil>, #<Person:0x007fd860828008 id: nil>]
[7] pry(main)> p = _
=> [#<Person:0x007fd8608281e8 id: nil>, #<Person:0x007fd860828008 id: nil>]
[8] pry(main)> p.map { |rec| rec.attributes }
=> [{"id"=>nil, "x_loc"=>nil, "non_x_loc"=>1}, {"id"=>nil, "x_loc"=>2, "non_x_loc"=>nil}]
[9] pry(main)> p.map { |rec| rec.attributes.except('id') }
=> [{"x_loc"=>nil, "non_x_loc"=>1}, {"x_loc"=>2, "non_x_loc"=>nil}]
Update
You can remove those nil values at the DB level:
Rails code:
[shreyas#rails_app_test (master)]$ rails c
Loading development environment (Rails 4.2.0)
[1] pry(main)> Person.group("lower(location)").select("CASE WHEN lower(location) = 'x' THEN COUNT(id) ELSE 0 END AS X_loc, CASE WHEN lower(location) != 'x' THEN COUNT(id) ELSE 0 END AS Non_X_loc")
Person Load (0.9ms) SELECT CASE WHEN lower(location) = 'x' THEN COUNT(id) ELSE 0 END AS X_loc, CASE WHEN lower(location) != 'x' THEN COUNT(id) ELSE 0 END AS Non_X_loc FROM "people" GROUP BY lower(location)
=> [#<Person:0x007fd858c100b0 id: nil>, #<Person:0x007fd860853e88 id: nil>]
[2] pry(main)> p = _
=> [#<Person:0x007fd858c100b0 id: nil>, #<Person:0x007fd860853e88 id: nil>]
[3] pry(main)> p.map { |rec| rec.attributes }
=> [{"id"=>nil, "x_loc"=>0, "non_x_loc"=>1}, {"id"=>nil, "x_loc"=>2, "non_x_loc"=>0}]
[4] pry(main)> p.map { |rec| rec.attributes.except('id') }
=> [{"x_loc"=>0, "non_x_loc"=>1}, {"x_loc"=>2, "non_x_loc"=>0}]
[5] pry(main)> p = Person.select("count(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, count(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc").group("lower(location)")
Person Load (0.9ms) SELECT count(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, count(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc FROM "people" GROUP BY lower(location)
=> [#<Person:0x007fd85b150f78 id: nil>, #<Person:0x007fd85b150230 id: nil>]
[6] pry(main)> p.map { |rec| rec.attributes.except('id') }
=> [{"x_loc"=>0, "non_x_loc"=>1}, {"x_loc"=>2, "non_x_loc"=>0}]
SQL
app_development=# select CASE WHEN lower(location) = 'x' THEN COUNT(id) ELSE 0 END AS X_loc, CASE WHEN lower(location) != 'x' THEN COUNT(id) ELSE 0 END AS Non_X_loc from people group by lower(location);
x_loc | non_x_loc
-------+-----------
0 | 1
2 | 0
(2 rows)
app_development=# select count(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, count(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc from people group by lower(location);
x_loc | non_x_loc
-------+-----------
0 | 1
2 | 0
(2 rows)
Update II
The classical approach to get the same output as FILTER:
app_development=# select count(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, sum(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc from people;
x_loc | non_x_loc
-------+-----------
2 | 1
(1 row)
app_development=# select sum(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, sum(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc from people;
x_loc | non_x_loc
-------+-----------
2 | 1
(1 row)
app_development=# select id, location, name from people;
id | location | name
----+----------+------
2 | X | foo
3 | X | foo
4 | Y | foo
(3 rows)
app_development=#
And the Rails way:
Loading development environment (Rails 4.2.0)
[1] pry(main)> p = Person.select("sum(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, sum(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc")
Person Load (0.6ms) SELECT sum(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, sum(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc FROM "people"
=> [#<Person:0x007fd85b6e6a78 id: nil>]
[2] pry(main)> p.first.attributes.except("id")
=> {"x_loc"=>2, "non_x_loc"=>1}
[3] pry(main)> p = Person.select("count(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, count(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc")
Person Load (0.5ms) SELECT count(CASE WHEN lower(location) = 'x' THEN 1 END) AS X_loc, count(CASE WHEN lower(location) != 'x' THEN 1 END) AS Non_X_loc FROM "people"
=> [#<Person:0x007fd85b77f098 id: nil>]
[4] pry(main)> p.first.attributes.except("id")
=> {"x_loc"=>2, "non_x_loc"=>1}
[5] pry(main)>
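For readers who want to try the single-row `count(CASE ...)` form without a Rails app, here is a small Python sketch against an in-memory SQLite database. The `people` table and its three rows mirror the psql session above; SQLite stands in for PostgreSQL here, which is an assumption for illustration only. It also shows that an empty table yields zeros rather than a missing key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, location TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(2, "X", "foo"), (3, "X", "foo"), (4, "Y", "foo")],
)

# No GROUP BY: one result row, one conditional count per column.
sql = """
SELECT COUNT(CASE WHEN lower(location) = 'x' THEN 1 END) AS x_loc,
       COUNT(CASE WHEN lower(location) != 'x' THEN 1 END) AS non_x_loc
FROM people
"""
counts = conn.execute(sql).fetchone()
print(counts)  # (2, 1)

# With no rows at all, COUNT still returns 0 for both columns.
conn.execute("DELETE FROM people")
empty_counts = conn.execute(sql).fetchone()
print(empty_counts)  # (0, 0)
```

Because the aggregate always produces exactly one row, both pie-chart slices are present even when one of the groups has no users.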
Honestly, what you have works fine. You just need to understand that if there's no value in the hash for true (or for false, for that matter), the count must default to zero. You can do that by calling .to_i on what will be a nil value. For example:
ny_count = User.group("lower(localidade) = 'new york'").count
"New York: #{ny_count[true].to_i}
Not New York: #{ny_count[false].to_i}
"

How to retrieve unique rows where multiple children that reference it exist for different types?

SELECT * FROM Fruit
INNER JOIN Apple ON Fruit.Id = Apple.FruitId
WHERE Apple.Type = 1 AND Apple.Type = 3
I need to get unique rows of Fruit that have both Apples that are of type 1 AND 3. Apple.Type is considered unique, but I wouldn't think it matters though.
With these rows, this should return two rows with both Fruit #50 and #52. The most important part is the Fruit.Id, I don't need to return the Types, but just need to make sure every single Fruit returned has at least one Apple.Type = 1 and one Apple.Type = 3.
Apple { Id = 1, FruitId = 50, Type = 0 }
Apple { Id = 2, FruitId = 50, Type = 1 }
Apple { Id = 3, FruitId = 50, Type = 3 }
Apple { Id = 4, FruitId = 51, Type = 1 }
Apple { Id = 5, FruitId = 51, Type = 2 }
Apple { Id = 6, FruitId = 52, Type = 3 }
Apple { Id = 7, FruitId = 52, Type = 1 }
Apple { Id = 8, FruitId = 52, Type = 2 }
Fruit { Id = 50 }
Fruit { Id = 51 }
Fruit { Id = 52 }
I'm not quite sure how to use DISTINCT and/or GROUP BY in order to form this query.
Group your apples table by fruit id and pick the results that have both desired types. Use this to get your fruits.
SELECT *
FROM Fruit
WHERE id IN
(
SELECT FruitId
FROM Apple
WHERE Type IN (1,3)
GROUP BY FruitId
HAVING COUNT(DISTINCT Type) = 2
);
This would return the fruits with ID 50 and 52.
SELECT *
FROM Fruit
WHERE EXISTS (
SELECT 1 FROM Apple
WHERE Type = 1 AND Apple.FruitId = Fruit.Id
) AND EXISTS (
SELECT 1 FROM Apple
WHERE Type = 3 AND Apple.FruitId = Fruit.Id
)
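Both of the queries above can be checked against the sample rows from the question. The following Python sketch loads them into an in-memory SQLite database (an assumption for illustration; the question does not say which database is used) and runs the GROUP BY/HAVING version and the EXISTS version side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Fruit (Id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE Apple (Id INTEGER PRIMARY KEY, FruitId INTEGER, Type INTEGER)")
conn.executemany("INSERT INTO Fruit VALUES (?)", [(50,), (51,), (52,)])
conn.executemany(
    "INSERT INTO Apple VALUES (?, ?, ?)",
    [(1, 50, 0), (2, 50, 1), (3, 50, 3), (4, 51, 1),
     (5, 51, 2), (6, 52, 3), (7, 52, 1), (8, 52, 2)],
)

# Keep only types 1 and 3, then require that both distinct types are present.
grouped_sql = """
SELECT Id FROM Fruit
WHERE Id IN (
    SELECT FruitId FROM Apple
    WHERE Type IN (1, 3)
    GROUP BY FruitId
    HAVING COUNT(DISTINCT Type) = 2
)
ORDER BY Id
"""

# One correlated EXISTS per required type.
exists_sql = """
SELECT Id FROM Fruit
WHERE EXISTS (SELECT 1 FROM Apple WHERE Type = 1 AND Apple.FruitId = Fruit.Id)
  AND EXISTS (SELECT 1 FROM Apple WHERE Type = 3 AND Apple.FruitId = Fruit.Id)
ORDER BY Id
"""

grouped_ids = [row[0] for row in conn.execute(grouped_sql)]
exists_ids = [row[0] for row in conn.execute(exists_sql)]
print(grouped_ids, exists_ids)  # [50, 52] [50, 52]
```

Fruit 51 is excluded by both queries because it has a type 1 apple but no type 3 apple.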
Not the most efficient way, but transposing those columns out so you have multiple types per fruitid should do it.
create table type_1 as select FruitId, Type as Type1 from Apple where Type = 1;
create table type_3 as select FruitId, Type as Type3 from Apple where Type = 3;
create table Fruits as select distinct FruitId from Apple;
create table Fruit_Agg as select a.FruitId, b.Type1, c.Type3 from Fruits a left join type_1 b on a.FruitId = b.FruitId left join type_3 c on a.FruitId = c.FruitId;
create table Types_1and_3 as select FruitId from Fruit_Agg where Type1 = 1 and Type3 = 3;

Awk code with associative arrays -- array doesn't seem populated, but no error

Question: Why does it seem that date_list[d] and isin_list[i] are not getting populated, in the code segment below?
AWK Code (on GNU-AWK on a Win-7 machine)
BEGIN { FS = "," } # This SEBI data set has comma-separated fields (NSE snapshots are pipe-separated)
# UPDATE the lists for DATE ($10), firm_ISIN ($9), EXCHANGE ($12), and FII_ID ($5).
( $17~/_EQ\>/ ) {
if (date[$10]++ == 0) date_list[d++] = $10; # Dates appear in order in raw data
if (isin[$9]++ == 0) isin_list[i++] = $9; # ISINs appear out of order in raw data
print $10, date[$10], $9, isin[$9], date_list[d], d, isin_list[i], i
}
input data
49290,C198962542782200306,6/30/2003,433581,F5811773991200306,S5405611832200306,B5086397478200306,NESTLE INDIA LTD.,INE239A01016,6/27/2003,1,E9035083824200306,REG_DL_STLD_02,591.13,5655,3342840.15,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49291,C198962542782200306,6/30/2003,433563,F6292896459200306,S6344227311200306,B6110521493200306,GRASIM INDUSTRIES LTD.,INE047A01013,6/27/2003,1,E9035083824200306,REG_DL_STLD_02,495.33,3700,1832721,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49292,C198962542782200306,6/30/2003,433681,F6513202607200306,S1724027402200306,B6372023178200306,HDFC BANK LTD,INE040A01018,6/26/2003,1,E745964372424200306,REG_DL_STLD_02,242,2600,629200,REG_DL_INSTR_EQ,REG_DL_DLAY_D,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49293,C7885768925200306,6/30/2003,48128,F4406661052200306,S7376401565200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,3,E912851176274200306,REG_DL_STLD_04,125,44600,5575000,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49294,C7885768925200306,6/30/2003,48129,F4500260787200306,S1312094035200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,4,E912851176274200306,REG_DL_STLD_04,125,445600,55700000,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
49295,C7885768925200306,6/30/2003,48130,F6425024637200306,S2872499118200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,3,E912851176274200306,REG_DL_STLD_04,125,48000,6000000,REG_DL_INSTR_EU,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
output that I am getting
6/27/2003 1 INE239A01016 1 1 1
6/27/2003 2 INE047A01013 1 1 2
6/26/2003 1 INE040A01018 1 2 3
6/28/2003 1 INE585B01010 1 3 4
6/28/2003 2 INE585B01010 2 3 4
Expected output
As far as I can tell, the print is correctly printing (i) $10 (the date), (ii) date[$10], the count for each date, (iii) $9 (the firm ID, called ISIN), (iv) isin[$9], the count for each ISIN, (v) d (the index of date_list, i.e. the number of unique dates) and (vi) i (the index of isin_list, i.e. the number of unique ISINs). I should also get two more columns (columns 5 and 7 below) for date_list[d] and isin_list[i], which should have values that look like $10 and $9.
6/27/2003 1 INE239A01016 1 6/27/2003 1 INE239A01016 1
6/27/2003 2 INE047A01013 1 6/27/2003 1 INE047A01013 2
6/26/2003 1 INE040A01018 1 6/26/2003 2 INE040A01018 3
6/28/2003 1 INE585B01010 1 6/28/2003 3 INE585B01010 4
6/28/2003 2 INE585B01010 2 6/28/2003 3 INE585B01010 4
The actual code I now use is:
{ if (date[$10]++ == 0) date_list[d++] = $10;
if (isin[$9]++ == 0) isin_list[i++] = $9;}
( $11~/1|2|3|5|9|1[24]/ ) { ++BNR[$10,$9,$12,$5] }
END { { for (u = 0; u < d; u++)
{for (v = 0; v < i; v++)
{ if (BNR[date_list[u],isin_list[v]]>0)
BR=BNR[date_list[u],isin_list[v]]
{ print(date_list[u], isin_list[v], BR) }}}}}
Thanks a lot to everyone.
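One likely reading of the missing columns, offered as an interpretation of the posted code rather than a confirmed diagnosis: `date_list[d++] = $10` stores the value at the old index and then increments d, so by the time print runs, `date_list[d]` points at the next, still unset slot, which awk evaluates to an empty string. The value just stored lives at `date_list[d-1]`. The Python sketch below models that behaviour with a dict standing in for the awk array; the function name `trace_unique` is hypothetical, and the dates are the `$10` values of the five `_EQ` rows in the input.

```python
def trace_unique(values):
    """Mimic `if (seen[v]++ == 0) lst[d++] = v` and record both lookups per row."""
    seen = {}   # plays the role of date[] (occurrence counts)
    lst = {}    # plays the role of date_list[] (awk arrays behave like dicts)
    d = 0       # next free index, as in the awk script
    rows = []
    for v in values:
        seen[v] = seen.get(v, 0) + 1
        if seen[v] == 1:
            lst[d] = v  # store at the old index ...
            d += 1      # ... then increment, like d++
        # lst.get(d, "") models awk returning "" for an element that was never set.
        rows.append((v, seen[v], lst.get(d, ""), lst[d - 1], d))
    return rows

dates = ["6/27/2003", "6/27/2003", "6/26/2003", "6/28/2003", "6/28/2003"]
rows = trace_unique(dates)
for row in rows:
    print(row)
```

The third field is empty on every row, matching the output actually observed, while the fourth field (`lst[d - 1]`) reproduces the date column of the expected output. Note that `date_list[d-1]` is the most recently added unique value, so it equals $10 only while repeated dates arrive consecutively, as they do in this sample.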

fetch array at the same way count specific results

Is it possible to fetch an array while also counting the results?
I'm trying to create an attendance monitor that counts the number of days present, the number of days absent, and the total number of meetings per schedule ID and student ID.
Here is my database:
SCHED_ID |STUDENT_ID |DATE | A_STAT
1234567 |2014-000003 |08/01/14 |Absent
123456 |2014-000003 |08/04/2014 |Present
1234567 |2014-000003 |08/10/2014 |Present
123456 |2014-000003 |08/10/2014 |Present
The output is supposed to look like this:
Subject Tot Num| Num of Days Present | Num of Days Absent
dasdasdasd 3 2 1
testing123 3 2 1
ASDASD 1 0 1
But it always comes out like this:
Subject Tot Num | Num of Days Present | Num of Days Absent
dasdasdasd 3 1 1
testing123 3 1 1
ASDASD 3 1 1
$query = $this->db->query("SELECT * FROM (tbl_schedule c, tbl_subject d, tbl_attendance e) where e.SCHED_ID = c.SCHED_ID AND d.SUBJECT_CODE = c.SUBJECT_CODE AND (e.STUDENT_ID = '$si') GROUP BY e.SCHED_ID")->result_array();
$query1 = $this->db->query("SELECT * FROM (tbl_schedule c, tbl_subject d, tbl_attendance e) where e.SCHED_ID = c.SCHED_ID AND d.SUBJECT_CODE = c.SUBJECT_CODE AND e.A_STAT='Present' AND (e.STUDENT_ID = '$si') GROUP BY e.SCHED_ID AND e.A_STAT")->num_rows();
$query2 = $this->db->query("SELECT * FROM (tbl_schedule c, tbl_subject d, tbl_attendance e) where e.SCHED_ID = c.SCHED_ID AND d.SUBJECT_CODE = c.SUBJECT_CODE AND e.A_STAT='Absent' AND (e.STUDENT_ID = '$si') GROUP BY e.SCHED_ID AND e.A_STAT")->num_rows();
$query3 = $this->db->query("SELECT * FROM (tbl_schedule c, tbl_subject d, tbl_attendance e) where e.SCHED_ID = c.SCHED_ID AND d.SUBJECT_CODE = c.SUBJECT_CODE AND (e.STUDENT_ID = '$si') GROUP BY e.SCHED_ID ")->num_rows();
for($i=0;$i<sizeof($query);$i++){
$data[$i][0]=$query1; // present
$data[$i][1]=$query2; //absent
$data[$i][2]=$query3; //total
$data[$i][3]=$query[$i]['SUBJECT_DESC']; // subj
}
return $data;
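A probable cause of the identical rows, as far as the posted code shows: $query1, $query2 and $query3 are each computed once as a single number (the number of groups returned, via num_rows()), and the loop then copies those same three numbers into every row. In addition, `GROUP BY e.SCHED_ID AND e.A_STAT` groups by the boolean result of the AND expression rather than by the two columns. A common alternative is one query with conditional sums per schedule. The sketch below shows that idea in Python with an in-memory SQLite database; it uses only the attendance table and the four sample rows from the question, and leaves out the joins to tbl_schedule and tbl_subject, so treat it as an assumption-laden illustration rather than a drop-in replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_attendance (SCHED_ID INTEGER, STUDENT_ID TEXT, DATE TEXT, A_STAT TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_attendance VALUES (?, ?, ?, ?)",
    [
        (1234567, "2014-000003", "08/01/14", "Absent"),
        (123456, "2014-000003", "08/04/2014", "Present"),
        (1234567, "2014-000003", "08/10/2014", "Present"),
        (123456, "2014-000003", "08/10/2014", "Present"),
    ],
)

# One row per schedule: total meetings plus a conditional sum per status.
sql = """
SELECT SCHED_ID,
       COUNT(*) AS total,
       SUM(CASE WHEN A_STAT = 'Present' THEN 1 ELSE 0 END) AS present,
       SUM(CASE WHEN A_STAT = 'Absent' THEN 1 ELSE 0 END) AS absent
FROM tbl_attendance
WHERE STUDENT_ID = ?
GROUP BY SCHED_ID
ORDER BY SCHED_ID
"""
summary = conn.execute(sql, ("2014-000003",)).fetchall()
print(summary)  # [(123456, 2, 2, 0), (1234567, 2, 1, 1)]
```

Each result row already carries its own totals, so the per-row loop only has to copy the columns across. The student ID is passed as a bound parameter instead of being interpolated into the SQL string.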

Get all rows where 2 fields exist In Array for PDO SQL?

I have an array structured like this:
Array
(
[0] => Array
(
[0] => 1 //x
[1] => 3 //y
)
[1] => Array
(
[0] => 8 //x
[1] => 7 //y
)
[2] => Array
(
[0] => 9 //x
[1] => 9 //y
)
)
What I want to know is whether there is a way to write a query that gets all rows where two fields match any pair of values in the second-level arrays. For example, say I have rows with:
| uid | id | x | y |
- - - - - - - - - - - - -
| 1 | 1 | 1 | 3 | //both x and y exist together
| 1 | 1 | 9 | 9 | //both x and y exist together
| 1 | 1 | 9 | 5 | //no combination do not select this
I'm trying to avoid looping the array and using SELECT every iteration, but would rather some way to do it directly in my query to lower the amount of looping.
Is this at all possible, or is my only option to loop over the array and query each pair one at a time? That seems quite expensive as the array grows in length.
I was hoping maybe there is some in_array method for SQL?
My suggestion would be to generate a long query from the array you've provided.
<?php
$arr = array(
array(1,3),
array(8,7),
array(9,9)
);
function wherify($val) {
return "(`x` = ".$val[0]." AND `y` = ".$val[1].")";
}
$criteria = implode(" OR ", array_map("wherify", $arr));
$query = "SELECT * FROM `table` WHERE $criteria";
echo $query;
This would create a query that would look something like the following.
SELECT * FROM `table`
WHERE (`x` = 1 AND `y` = 3)
OR (`x` = 8 AND `y` = 7)
OR (`x` = 9 AND `y` = 9)
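The same OR-of-pairs idea can be combined with bound parameters so that the array values never get concatenated into the SQL text. Here is a minimal Python sketch using the standard sqlite3 module; the table name `points` and the helper `rows_matching_pairs` are hypothetical, and the three rows come from the example in the question. With PDO the equivalent would be a prepared statement with `?` placeholders, though that part is not shown here.

```python
import sqlite3

def rows_matching_pairs(conn, pairs):
    """Return rows whose (x, y) equals any of the given pairs, using placeholders."""
    if not pairs:
        return []  # an empty WHERE clause would be invalid SQL
    clause = " OR ".join(["(x = ? AND y = ?)"] * len(pairs))
    params = [value for pair in pairs for value in pair]  # flatten [(1, 3), ...]
    sql = f"SELECT uid, id, x, y FROM points WHERE {clause} ORDER BY x, y"
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (uid INTEGER, id INTEGER, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 3), (1, 1, 9, 9), (1, 1, 9, 5)],
)

matches = rows_matching_pairs(conn, [(1, 3), (8, 7), (9, 9)])
print(matches)  # [(1, 1, 1, 3), (1, 1, 9, 9)]
```

Only one query is issued regardless of how many pairs there are, and the row with x = 9, y = 5 is left out because that combination is not in the list.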