MIN() MAX() BigQuery - Unexpected behaviour - google-bigquery

The result of querying
SELECT
Type
, val
, MIN(val) over (partition by Type) as min_val
, MAX(val) over (partition by Type) as max_val
FROM tabA
gives this unexpected output:
Type val min_val max_val
1 A -52.50 -50.00 -64.00
2 A -50.00 -50.00 -64.00
3 A -64.00 -50.00 -64.00
4 A -58.50 -50.00 -64.00
5 B -66.00 -35.33 -75.33
6 B -73.00 -35.33 -75.33
7 B -35.33 -35.33 -75.33
8 B -56.33 -35.33 -75.33
9 B -44.33 -35.33 -75.33
10 B -75.33 -35.33 -75.33
11 B -57.00 -35.33 -75.33
12 B -69.00 -35.33 -75.33
where min_val and max_val are reversed. Is there any possible explanation for this?

The value is not stored in a numeric type. BigQuery can order strings too, which is what it is doing here: compared as text, '-50.00' sorts before '-64.00', so it comes out as the minimum.
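The effect is easy to reproduce outside BigQuery. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in engine (an assumption, not what the question ran on), with the Type A rows from the question stored as text. The helper name min_max is made up for illustration. In BigQuery itself the analogous fix would be a cast such as CAST(val AS FLOAT64) inside MIN() and MAX().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# val is deliberately declared as TEXT to mimic the situation in the question.
conn.execute("CREATE TABLE tabA (Type TEXT, val TEXT)")
conn.executemany(
    "INSERT INTO tabA VALUES (?, ?)",
    [("A", "-52.50"), ("A", "-50.00"), ("A", "-64.00"), ("A", "-58.50")],
)


def min_max(expr):
    """Return (min, max) of the given SQL expression per Type partition."""
    sql = (
        f"SELECT DISTINCT MIN({expr}) OVER (PARTITION BY Type), "
        f"MAX({expr}) OVER (PARTITION BY Type) FROM tabA"
    )
    return conn.execute(sql).fetchone()


as_text = min_max("val")                  # string comparison
as_number = min_max("CAST(val AS REAL)")  # numeric comparison

print(as_text)    # ('-50.00', '-64.00')  -> looks "reversed"
print(as_number)  # (-64.0, -50.0)        -> expected ordering
```

Casting inside the aggregate is enough to restore numeric ordering; storing the column in a numeric type in the first place avoids the cast altogether.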

Related

how can I get the row of value that also occurred in other row?

As a beginner to sql, I have a table like below in PostgreSQL.
id product type occured_at
1 A 6 2017-03-17 10:21:22.935278
1 B 6 2017-04-17 10:21:22.941801
1 B 8 2017-04-17 10:21:22.935278
1 B 8 2017-05-17 10:21:22.935278
1 D 8 2017-06-17 10:21:22.935278
2 C 4 2017-04-24 10:21:22.938517
3 A 8 2017-04-27 10:21:22.941801
4 C 8 2017-09-17 10:21:22.941801
4 C 6 2017-09-17 10:21:22.941801
5 D 8 2017-09-17 10:21:22.941801
The question is how can I get the id and product which has different types that occurred on the same date. Just like
id product
1 B
4 C
If anyone can help me out, I really appreciate it :)
A self join is one approach:
SELECT DISTINCT A.ID, A.Product
FROM YourTable A
JOIN YourTable B ON A.ID = B.ID
AND A.Product = B.Product
AND DATE(A.Occured_at) = DATE(B.Occured_at)
AND A.Type <> B.Type
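Alternatively, group by id, product and day, and keep the groups that contain more than one type:
SELECT id, product
FROM YourTable
GROUP BY id, product, date_trunc('day', occured_at)
HAVING count(DISTINCT type) > 1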
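To check the grouping idea against the sample rows, here is a small sketch using Python's sqlite3 module rather than PostgreSQL (an assumption made only so the example is self-contained). SQLite has no date_trunc, so substr(occured_at, 1, 10) stands in for truncating the timestamp to the day; the table name events is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER, product TEXT, type INTEGER, occured_at TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (1, "A", 6, "2017-03-17 10:21:22.935278"),
        (1, "B", 6, "2017-04-17 10:21:22.941801"),
        (1, "B", 8, "2017-04-17 10:21:22.935278"),
        (1, "B", 8, "2017-05-17 10:21:22.935278"),
        (1, "D", 8, "2017-06-17 10:21:22.935278"),
        (2, "C", 4, "2017-04-24 10:21:22.938517"),
        (3, "A", 8, "2017-04-27 10:21:22.941801"),
        (4, "C", 8, "2017-09-17 10:21:22.941801"),
        (4, "C", 6, "2017-09-17 10:21:22.941801"),
        (5, "D", 8, "2017-09-17 10:21:22.941801"),
    ],
)

# One group per (id, product, calendar day); keep groups with 2+ distinct types.
result = conn.execute(
    """
    SELECT id, product
    FROM events
    GROUP BY id, product, substr(occured_at, 1, 10)
    HAVING COUNT(DISTINCT type) > 1
    ORDER BY id
    """
).fetchall()

print(result)  # [(1, 'B'), (4, 'C')]
```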

HiveQL - join multi-level subtotals to existing table

My goal is to determine the size of various organizations at various levels. Let's assume we have three organizations 'A', 'B', and 'C', each consisting of multiple departments that are further subdivided into teams with members, as outlined below:
Org. Dep. Tm. Member
A 1 I name1
A 1 I name2
A 1 I name3
A 1 II name4
A 2 I name5
A 2 I name6
B 1 I name7
B 1 II name8
B 1 II name9
B 1 II name10
B 2 I name11
B 2 I name12
B 2 II name13
B 2 II name14
B 2 III name15
B 2 III name16
C 1 I name17
C 1 I name18
C 1 I name19
C 1 I name20
C 1 I name21
Now, I'd like to know for each member how large their respective Org., Dep. and Tm. are, like this:
Org. Dep. Tm. Member org dep tm
A 1 I name1 6 4 3
A 1 I name2 6 4 3
A 1 I name3 6 4 3
A 1 II name4 6 4 1
A 2 I name5 6 2 2
A 2 I name6 6 2 2
B 1 I name7 10 4 1
B 1 II name8 10 4 3
B 1 II name9 10 4 3
B 1 II name10 10 4 3
B 2 I name11 10 6 2
B 2 I name12 10 6 2
B 2 II name13 10 6 2
B 2 II name14 10 6 2
B 2 III name15 10 6 2
B 2 III name16 10 6 2
C 1 I name17 5 5 5
C 1 I name18 5 5 5
C 1 I name19 5 5 5
C 1 I name20 5 5 5
C 1 I name21 5 5 5
My original idea was to do this with multiple LEFT JOINS to aggregate the different levels, but this scales very poorly as you need a new join for every aggregation level. Is there a way to do this efficiently in a single statement?
Use window functions:
select org, dep, tm,
count(*) over (partition by org) as org_cnt,
count(*) over (partition by org, dep) as dep_cnt,
count(*) over (partition by org, dep, tm) as tm_cnt
from t;
The columns are hierarchical so dep and tm need the higher levels of the hierarchy.
EDIT:
If Hive doesn't support count(distinct) and you need it, then you can do:
select org, dep, tm,
sum(case when seqnum_o = 1 then 1 else 0 end) over (partition by org) as org_cnt,
sum(case when seqnum_od = 1 then 1 else 0 end) over (partition by org, dep) as dep_cnt,
sum(case when seqnum_odt = 1 then 1 else 0 end) over (partition by org, dep, tm) as tm_cnt
from (select t.*,
row_number() over (partition by org, memberid order by org) as seqnum_o,
row_number() over (partition by org, dep, memberid order by org) as seqnum_od,
row_number() over (partition by org, dep, tm, memberid order by org) as seqnum_odt
from t
) t;
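The window-function approach can be tried on the sample data without a Hive cluster. This is a sketch under the assumption that SQLite (3.25 or newer, via Python's sqlite3) behaves like Hive for count(*) over (partition by ...); the table name t follows the answer, while the column name member and the teams list used to generate the rows are made up for illustration.

```python
import sqlite3

# (org, dep, tm, number of members), in the same order as the question's table.
teams = [
    ("A", 1, "I", 3), ("A", 1, "II", 1), ("A", 2, "I", 2),
    ("B", 1, "I", 1), ("B", 1, "II", 3), ("B", 2, "I", 2),
    ("B", 2, "II", 2), ("B", 2, "III", 2), ("C", 1, "I", 5),
]

rows = []
for org, dep, tm, size in teams:
    for _ in range(size):
        rows.append((org, dep, tm, f"name{len(rows) + 1}"))  # name1 .. name21

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (org TEXT, dep INTEGER, tm TEXT, member TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Each count uses a progressively narrower partition of the hierarchy.
sizes = {
    member: (org_cnt, dep_cnt, tm_cnt)
    for member, org_cnt, dep_cnt, tm_cnt in conn.execute(
        """
        SELECT member,
               COUNT(*) OVER (PARTITION BY org)          AS org_cnt,
               COUNT(*) OVER (PARTITION BY org, dep)     AS dep_cnt,
               COUNT(*) OVER (PARTITION BY org, dep, tm) AS tm_cnt
        FROM t
        """
    )
}

print(sizes["name1"])   # (6, 4, 3)
print(sizes["name7"])   # (10, 4, 1)
print(sizes["name17"])  # (5, 5, 5)
```

No join is needed: adding another aggregation level only means adding one more COUNT(*) OVER (...) column.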

See if all records in the same group are of accepted types

Consider the following table. Each document (id) belongs to a group (group_id).
-----------------------
id group_id value
-----------------------
1 1 A
2 1 B
3 1 D
4 2 A
5 2 B
6 3 C
7 4 A
8 4 B
9 4 B
10 4 B
11 4 C
12 5 A
13 5 A
14 5 A
15 6 B
16 6 NULL
17 6 NULL
18 6 D
19 7 NULL
20 8 B
1/ Each document has a value NULL, A, B, C or D
2/ If the documents in the same group all have either A or B as value, the group is completed
3/ In this case, the desired output would read:
---------------------
group_id completed
---------------------
1 0 <== because document 3 = D
2 1 <== all documents have either A or B as a value
3 0 <== only one document in the group, value C
4 1 <== all documents have either A or B as a value
5 1 <== all documents have value A
6 0 <== because of NULL values and value D
7 0 <== NULL
8 1 <== only one document, value B
Is it possible to query this result set?
As I am not very experienced in SQL, any help would be appreciated!
Try this
SELECT [group_id],
CASE
WHEN Count(CASE WHEN [value] IN ( 'A', 'B' ) THEN 1 END) = Count(*) THEN 1
ELSE 0
END AS COMPLETED
FROM yourtable
GROUP BY [group_id]
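The conditional count works because COUNT(expression) skips NULLs: rows whose value is not 'A' or 'B' (including NULL values) make the inner CASE return NULL, so the two counts differ. A minimal sketch with Python's sqlite3 module, assuming SQLite as a stand-in engine and using only groups 1, 2, 6, 7 and 8 from the sample data; the table name docs is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, group_id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [
        (1, 1, "A"), (2, 1, "B"), (3, 1, "D"),
        (4, 2, "A"), (5, 2, "B"),
        (15, 6, "B"), (16, 6, None), (17, 6, None), (18, 6, "D"),
        (19, 7, None),
        (20, 8, "B"),
    ],
)

# A group is completed when the number of A/B rows equals the total row count.
completed = dict(
    conn.execute(
        """
        SELECT group_id,
               CASE
                 WHEN COUNT(CASE WHEN value IN ('A', 'B') THEN 1 END) = COUNT(*)
                 THEN 1 ELSE 0
               END AS completed
        FROM docs
        GROUP BY group_id
        """
    )
)

print(completed)  # {1: 0, 2: 1, 6: 0, 7: 0, 8: 1}
```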

SQL query for fetching data

Hi, I have the following situation in SQL:
table name: case_details
caseid refno clientid report_date
1 1/1 1007 08-05-2013
2 1/2 1007 01-06-2013
3 1/3 1007 12-07-2013
4 1/4 1012 17-07-13
5 1/6 1009 08-07-13
table name: case_check_detail
caseid checkid alert_val
1 1 1
1 2 2
1 3 1
1 4 2
2 5 4
2 6 3
2 7 2
2 8 1
3 9 2
3 10 1
3 11 2
3 12 1
4 13 3
4 14 3
4 15 3
4 16 4
5 17 1
5 18 2
5 19 1
5 20 2
I want to count how many cases there are for clientid 1007 whose highest alert_val is 2 and whose report_date lies between 01-05-2013 and 18-07-2013.
In this example the matching cases are:
caseid 1 and caseid 3
Try
SELECT d.caseid
FROM case_details d JOIN case_check_detail c
ON d.caseid = c.caseid
WHERE d.clientid = 1007
AND d.report_date BETWEEN '20130501' AND '20130718'
GROUP BY d.caseid
HAVING MAX(c.alert_val) = 2
Output:
| CASEID |
----------
| 1 |
| 3 |
If you want to count them
SELECT COUNT(*) total
FROM
(
SELECT d.caseid
FROM case_details d JOIN case_check_detail c
ON d.caseid = c.caseid
WHERE d.clientid = 1007
AND d.report_date BETWEEN '20130501' AND '20130718'
GROUP BY d.caseid
HAVING MAX(c.alert_val) = 2
) q
Output:
| TOTAL |
---------
| 2 |
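Or, as a single statement with a correlated subquery:
SELECT COUNT(*)
FROM case_details AS cd
WHERE cd.clientid = 1007
AND cd.report_date BETWEEN '2013-05-01' AND '2013-07-18'
AND (SELECT MAX(ccd.alert_val)
FROM case_check_detail AS ccd
WHERE ccd.caseid = cd.caseid) = 2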
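The GROUP BY ... HAVING MAX(alert_val) = 2 approach can be verified against the sample tables. This sketch uses Python's sqlite3 module and assumes the report dates are stored as ISO strings (YYYY-MM-DD), which is not how they are printed in the question; with that assumption BETWEEN compares them correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE case_details "
    "(caseid INTEGER, refno TEXT, clientid INTEGER, report_date TEXT)"
)
conn.execute(
    "CREATE TABLE case_check_detail (caseid INTEGER, checkid INTEGER, alert_val INTEGER)"
)
conn.executemany(
    "INSERT INTO case_details VALUES (?, ?, ?, ?)",
    [
        (1, "1/1", 1007, "2013-05-08"),
        (2, "1/2", 1007, "2013-06-01"),
        (3, "1/3", 1007, "2013-07-12"),
        (4, "1/4", 1012, "2013-07-17"),
        (5, "1/6", 1009, "2013-07-08"),
    ],
)

# alert_val per case, in checkid order as listed in the question.
alerts = {1: [1, 2, 1, 2], 2: [4, 3, 2, 1], 3: [2, 1, 2, 1],
          4: [3, 3, 3, 4], 5: [1, 2, 1, 2]}
checkid = 0
for caseid, values in alerts.items():
    for alert_val in values:
        checkid += 1
        conn.execute(
            "INSERT INTO case_check_detail VALUES (?, ?, ?)",
            (caseid, checkid, alert_val),
        )

per_case_sql = """
    SELECT d.caseid
    FROM case_details d
    JOIN case_check_detail c ON d.caseid = c.caseid
    WHERE d.clientid = 1007
      AND d.report_date BETWEEN '2013-05-01' AND '2013-07-18'
    GROUP BY d.caseid
    HAVING MAX(c.alert_val) = 2
"""

case_ids = [row[0] for row in conn.execute(per_case_sql + " ORDER BY d.caseid")]
total = conn.execute(f"SELECT COUNT(*) FROM ({per_case_sql}) q").fetchone()[0]

print(case_ids)  # [1, 3]
print(total)     # 2
```

Case 5 also has a maximum alert_val of 2 but belongs to client 1009, and case 2 is excluded because its maximum is 4.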

mysql count from same table and data from the other table

I want to display the names of the registered users who appear under a given replyid, together with each user's total number of rows in details_table. I don't know what the correct query would be.
Here are the tables.
details_table
id regid replyid
-------------------
1 1 2
2 1 3
6 2 4
5 3 4
8 2 5
9 3 5
10 4 5
11 5 5
12 2 6
13 6 6
14 4 6
15 7 7
16 8 7
17 9 7
18 10 8
19 2 9
20 2 10
21 11 10
22 12 10
reg_table
id regname
---------------
1 Sam
2 Ash
3 Tina
4 Rohny
5 Martin
6 Natasha
7 Natalia
8 Kim
9 Alex
10 John
11 Neil
12 Peter
So if replyid 10 is selected from details_table in the WHERE clause, the query is supposed to return regid 2, 11 and 12 (Ash, Neil, Peter) from reg_table with the counts Ash=5, Neil=1, Peter=1.
SELECT a.id, a.regname, COUNT(1)
FROM reg_table a, details_table b,
details_table c
WHERE b.replyid=10
AND b.regid = a.id
AND c.regid = a.id
GROUP BY a.id, a.regname
Or with a derived table (note that the key column in reg_table is id, not regid):
SELECT r.id, r.regname, count(*)
FROM (
SELECT DISTINCT regid
FROM details_table
WHERE replyid = 10
) rg
JOIN reg_table r ON rg.regid = r.id
JOIN details_table d ON r.id = d.regid
GROUP BY r.id, r.regname
Try this:
select count(dt.regid) as cnt, rt.regname
from details_table dt
join reg_table rt on dt.regid = rt.id
where dt.regid in (select regid from details_table where replyid = 10)
group by rt.id, rt.regname
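SELECT reg_table.regname, count(*)
FROM reg_table
JOIN details_table ON details_table.regid = reg_table.id
WHERE EXISTS (SELECT 1 FROM details_table d2 WHERE d2.regid = reg_table.id AND d2.replyid = 10)
GROUP BY reg_table.id, reg_table.regname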
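The key point in this thread is that the replyid filter decides which users are listed, while the count runs over all of each user's rows. A sketch of that idea on the sample tables, using Python's sqlite3 module in place of MySQL (an assumption for the sake of a self-contained example); the function name counts_for_reply is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE details_table (id INTEGER, regid INTEGER, replyid INTEGER)")
conn.execute("CREATE TABLE reg_table (id INTEGER, regname TEXT)")
conn.executemany(
    "INSERT INTO details_table VALUES (?, ?, ?)",
    [
        (1, 1, 2), (2, 1, 3), (6, 2, 4), (5, 3, 4), (8, 2, 5), (9, 3, 5),
        (10, 4, 5), (11, 5, 5), (12, 2, 6), (13, 6, 6), (14, 4, 6),
        (15, 7, 7), (16, 8, 7), (17, 9, 7), (18, 10, 8), (19, 2, 9),
        (20, 2, 10), (21, 11, 10), (22, 12, 10),
    ],
)
names = ["Sam", "Ash", "Tina", "Rohny", "Martin", "Natasha",
         "Natalia", "Kim", "Alex", "John", "Neil", "Peter"]
conn.executemany("INSERT INTO reg_table VALUES (?, ?)", enumerate(names, start=1))


def counts_for_reply(replyid):
    """List users who appear under replyid, with their total row count."""
    return conn.execute(
        """
        SELECT r.regname, COUNT(*)
        FROM reg_table r
        JOIN details_table d ON d.regid = r.id
        WHERE r.id IN (SELECT regid FROM details_table WHERE replyid = ?)
        GROUP BY r.id, r.regname
        ORDER BY r.id
        """,
        (replyid,),
    ).fetchall()


print(counts_for_reply(10))  # [('Ash', 5), ('Neil', 1), ('Peter', 1)]
```

Filtering with dt.replyid = 10 directly in the outer WHERE clause would instead count only the rows for that reply, giving 1 for every user.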