Generate frequency table from sql using Count with user defined condition


Basically I need to generate a frequency table using sql, and I have a sample table like this:
user_id  user_label  code_1  date
-------  ----------  ------  -----
1        x           a       01-01
1        x           a       01-01
1        x           a       01-02
1        x           b       01-01
1        x           c       01-02
1        y           a       01-01
2        x           a       01-01
etc
The rule for counting occurrences is: if two rows have the same user_id, user_label and date, a repeated code should be counted only once.
For example, for the first two rows the frequency table should be:
user_id  user_label  code_1  count_code_1
-------  ----------  ------  ------------
1        x           a       1
Even though there are two instances of a, they occur on the same date, so they should be counted only once. I need to do this for every unique code in the code_1 column, for all combinations of user_id + user_label.
After processing the third row , the frequency table should be :
user_id  user_label  code_1  count_code_1
-------  ----------  ------  ------------
1        x           a       2
Although it is the same code ('a'), it occurs on a different date (01-02), so it is counted again.
In the end, for the sample table given above, the desired result should be
user_id  user_label  code_1  count_code_1
-------  ----------  ------  ------------
1        x           a       2
1        x           b       1
1        x           c       1
1        y           a       1
2        x           a       1
What I have so far is
select t.user_id, t.user_label, t.code_1, count(###)
from t
group by t.code_1,t.user_id, t.user_label
The problem is:
1. I don't know what to put inside the count.
2. I don't know how to incorporate the condition on date into this query.
Any suggestions or corrections would be greatly appreciated.

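You seem to want count(distinct date). Grouping by user_id, user_label and code_1 gives one row per combination, and counting distinct dates collapses repeats of a code on the same date:
select t.user_id, t.user_label, t.code_1,
       count(distinct t.date) as count_code_1
from t
group by t.user_id, t.user_label, t.code_1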
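To check the query against the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database through Python's sqlite3 module. SQLite, the TEXT column types and the count_code_1 alias are assumptions made only to keep the example runnable; the question does not say which database is in use.

```python
import sqlite3

# In-memory SQLite database (an assumption; the question names no engine).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (user_id INTEGER, user_label TEXT, code_1 TEXT, date TEXT)"
)

# The seven sample rows from the question.
rows = [
    (1, "x", "a", "01-01"),
    (1, "x", "a", "01-01"),
    (1, "x", "a", "01-02"),
    (1, "x", "b", "01-01"),
    (1, "x", "c", "01-02"),
    (1, "y", "a", "01-01"),
    (2, "x", "a", "01-01"),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# One row per (user_id, user_label, code_1); same-date repeats count once.
freq = conn.execute(
    """
    SELECT user_id, user_label, code_1, COUNT(DISTINCT date) AS count_code_1
    FROM t
    GROUP BY user_id, user_label, code_1
    ORDER BY user_id, user_label, code_1
    """
).fetchall()

for row in freq:
    print(row)
```

The rows printed should match the desired result table in the question, with code a for user 1 and label x counted twice because it appears on two distinct dates.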

Related

Find rows that have both boolean in a column

I have an Oracle 11 table like this:
id  name    have_child
--  ------  ----------
1   Alison  N
2   Mary    N
3   Meg     Y
4   Mary    N
5   Meg     N
where have_child is probably a Boolean stored as Y/N.
I want a query that lists the inconsistent cases, where one name has both Y and N, like Meg:
id  name  have_child
--  ----  ----------
3   Meg   Y
5   Meg   N
As a result I want to list the entire rows.
I do not want to list proper duplicates, like Mary:
id  name  have_child
--  ----  ----------
2   Mary  N
4   Mary  N
I know how to count particular names and list which names appear more than once, like this:
SELECT name from table
GROUP BY name
HAVING COUNT(*)>1;

This could be a way:
select id, name, have_child
from (
select t.*,
count(distinct have_child) over (partition by name) as num
from yourTable t
)
where num > 1
The inner query lists all the records of the table, adding a column that gives the number of distinct have_child values for the same name.
The outer query then keeps only the rows for which this number is greater than 1.
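Not every engine accepts count(distinct ...) as a window function (SQLite, as far as I know, rejects it). As a hedged alternative, the same filter can be sketched with a GROUP BY ... HAVING subquery, which is a different technique from the analytic query above but should select the same rows. The example below runs it on the question's sample data in an in-memory SQLite database, which is an assumption made for runnability; the question itself is about Oracle.

```python
import sqlite3

# In-memory SQLite stand-in for the Oracle table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, name TEXT, have_child TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, "Alison", "N"),
        (2, "Mary", "N"),
        (3, "Meg", "Y"),
        (4, "Mary", "N"),
        (5, "Meg", "N"),
    ],
)

# The subquery finds names with more than one distinct have_child value;
# the outer query returns the full rows for those names.
conflicts = conn.execute(
    """
    SELECT id, name, have_child
    FROM yourTable
    WHERE name IN (
        SELECT name
        FROM yourTable
        GROUP BY name
        HAVING COUNT(DISTINCT have_child) > 1
    )
    ORDER BY id
    """
).fetchall()

for row in conflicts:
    print(row)
```

Mary is excluded because both of her rows carry the same value, so her distinct count is 1.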

Top 1 record of a grouped data in hive sql

I have a table with three columns, pid, org and amount, as shown below.
pid  org  amount
---  ---  ------
1    1    5
1    1    6
2    1    2
2    1    4
I need the records grouped by pid and org with the maximum amount.
Since Hive does not support all of the richer SQL features, I need a simple way of doing this.
The result table should look like this:
pid  org  amount
---  ---  ------
1    1    6
2    1    4

select pid,org,max(amount) from table1 group by pid,org;

Use the max function, which returns the maximum value of the column in each group:
select pid,org,max(amount) from data
group by pid,org;
If that does not work, cast amount to double first:
select pid,org,max(CAST(amount as double)) from data
group by pid,org;
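One situation where the cast could matter is when amount is stored as text, because text compares character by character. The sketch below illustrates this in an in-memory SQLite database; SQLite, the TEXT column type, the REAL cast (standing in for Hive's double) and the amounts 9 and 10 for pid 2 are all assumptions chosen to make the difference visible, not values from the question.

```python
import sqlite3

# In-memory SQLite stand-in for the Hive table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (pid INTEGER, org INTEGER, amount TEXT)")

# Hypothetical amounts stored as text; '9' sorts after '10' as a string.
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, 1, "5"), (1, 1, "6"), (2, 1, "9"), (2, 1, "10")],
)

# MAX on the text column compares strings.
text_max = conn.execute(
    "SELECT pid, org, MAX(amount) FROM data GROUP BY pid, org ORDER BY pid"
).fetchall()

# Casting to a numeric type first compares numbers instead.
numeric_max = conn.execute(
    """
    SELECT pid, org, MAX(CAST(amount AS REAL))
    FROM data
    GROUP BY pid, org
    ORDER BY pid
    """
).fetchall()

print(text_max)
print(numeric_max)
```

Here the text comparison picks '9' for pid 2, while the numeric comparison picks 10.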

Including additional columns with COUNT DISTINCT query

I have a table that has the following columns: Netting_Pool, Counterparty and Account. My goal is to run a SQL query to show when there is a Netting_Pool with more than 1 Counterparty, and to show the Accounts linked to those Counterparties.
An example:
Netting_Pool Counterparty Account
1 ----- A ----- ASD
1 ----- A ----- XYZ
1 ----- B ----- DEF
2 ----- C ----- YUI
3 ----- D ----- TRE
4 ----- E ----- DDW
5 ----- F ----- QWE
I would like the query to have the following Return:
1 ----- A ----- ASD
1 ----- A ----- XYZ
1 ----- B ----- DEF
So far the closest I have come is the following:
SELECT netting_pool, count(distinct counterparty)
FROM Table
GROUP BY netting_pool
HAVING count(distinct counterparty) > 1
Which returns:
Netting_Pool, Count (distinct Counterparty)
1 2
I have not been able to incorporate the Counterparty or Account values to my query and have it produce the results I want. Any help would be much appreciated!

Your query aggregates, so it returns only one row per Netting_Pool and the Counterparty and Account values are lost. Another way to do this is with window (analytic) functions, which are supported by most but not all databases.
Unfortunately, count(distinct) is not generally supported as a window function. You can work around this by comparing the minimum and maximum Counterparty in each pool; ignoring NULLs, they differ exactly when the pool has more than one counterparty:
select Netting_Pool, Counterparty, Account
from (select t.*,
             min(Counterparty) over (partition by Netting_Pool) as minc,
             max(Counterparty) over (partition by Netting_Pool) as maxc
      from table t
     ) t
where minc <> maxc;
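As a rough check of the min/max workaround, the sketch below runs it on the question's sample data. It compares Counterparty values, since the question asks for pools with more than one counterparty. The in-memory SQLite database (which needs window-function support, SQLite 3.25 or later), the table name pools and the aliases min_cp and max_cp are assumptions introduced for the example.

```python
import sqlite3

# In-memory SQLite database; 'pools' is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pools (Netting_Pool INTEGER, Counterparty TEXT, Account TEXT)"
)
conn.executemany(
    "INSERT INTO pools VALUES (?, ?, ?)",
    [
        (1, "A", "ASD"),
        (1, "A", "XYZ"),
        (1, "B", "DEF"),
        (2, "C", "YUI"),
        (3, "D", "TRE"),
        (4, "E", "DDW"),
        (5, "F", "QWE"),
    ],
)

# Attach each pool's smallest and largest Counterparty to every row, then
# keep rows from pools where the two differ (more than one counterparty).
result = conn.execute(
    """
    SELECT Netting_Pool, Counterparty, Account
    FROM (
        SELECT p.*,
               MIN(Counterparty) OVER (PARTITION BY Netting_Pool) AS min_cp,
               MAX(Counterparty) OVER (PARTITION BY Netting_Pool) AS max_cp
        FROM pools p
    )
    WHERE min_cp <> max_cp
    ORDER BY Counterparty, Account
    """
).fetchall()

for row in result:
    print(row)
```

Only the three rows of pool 1 come back, because it is the only pool with two different counterparties (A and B).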

querying for multiple records by date

I've got a few tables like this:
Color
id  Color_Name
--  ----------
1   RED
2   GREEN
Color_Shades
id  ColorId  ShadeId  date_created
--  -------  -------  ------------
1   1        55       03/15/2013
2   1        43       02/01/2012
3   2        13       05/15/2011
4   2        15       06/11/2009
I'm trying to get a list of all distinct colors with their latest date.
I tried
SELECT a.Color_Name, b.date_created FROM Color a, Color_Shades b
WHERE a.id = b.ColorId
but this is giving me mixed results.
My desired results are:
Color_Name  date_created
----------  ------------
RED         03/15/2013
GREEN       05/15/2011

You are close to what you need. You just need to group by the color and aggregate date_created with MAX to get each color's latest date.
SELECT a.Color_name, MAX(b.date_created) date_created
FROM Color a
INNER JOIN Color_shades b
ON a.id = b.colorID
GROUP BY a.Color_Name
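The join-and-MAX query can be tried on the question's data with the sketch below. It uses an in-memory SQLite database and stores date_created as ISO-formatted text (YYYY-MM-DD); both are assumptions. The ISO format is chosen deliberately: SQLite has no native date type, and MM/DD/YYYY strings would be compared as text and could pick the wrong row.

```python
import sqlite3

# In-memory SQLite database (an assumption; the question names no engine).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Color (id INTEGER, Color_Name TEXT);
    CREATE TABLE Color_Shades (
        id INTEGER, ColorId INTEGER, ShadeId INTEGER, date_created TEXT
    );
    INSERT INTO Color VALUES (1, 'RED'), (2, 'GREEN');
    -- Dates rewritten as YYYY-MM-DD so that text order matches date order.
    INSERT INTO Color_Shades VALUES
        (1, 1, 55, '2013-03-15'),
        (2, 1, 43, '2012-02-01'),
        (3, 2, 13, '2011-05-15'),
        (4, 2, 15, '2009-06-11');
    """
)

# One row per color, carrying the latest date among its shades.
latest = conn.execute(
    """
    SELECT a.Color_Name, MAX(b.date_created) AS date_created
    FROM Color a
    INNER JOIN Color_Shades b ON a.id = b.ColorId
    GROUP BY a.Color_Name
    ORDER BY a.Color_Name
    """
).fetchall()

for row in latest:
    print(row)
```

In a database with a real date column type, the MAX comparison works on dates directly and the reformatting step is not needed.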

Comparing two columns from different tables of different types in SQL

There is a column that exists in two tables. In one table it holds integer flags (1 and 0), while in the other it holds 'Y' and 'N'.
Essentially I need to display the rows in table 1 whose values differ from the values in table 2 for that column. How do I map 1 to Y and 0 to N for the comparison?
Example:
Table 1:
DateRecorded  SchoolName  StudentName  isAbsent  hasPassed
------------  ----------  -----------  --------  ---------
2011-04-03    ABC         John         Y         Y
2011-04-05    ABC         John         N         Y
Table 2:
DateRecorded  SchoolName  StudentName  isAbsent  hasPassed
------------  ----------  -----------  --------  ---------
2011-04-03    ABC         John         0         1
2011-04-05    ABC         John         0         1
This should return the row:
2011-04-03    ABC         John         Y         Y
from Table 1, as this row conflicts with the corresponding row in Table 2.

Try this:
SELECT * FROM tbl1
EXCEPT
SELECT
daterecorded,
schoolname,
studentname,
CASE isAbsent WHEN 1 THEN 'Y' WHEN 0 THEN 'N' END AS isAbsent,
CASE hasPassed WHEN 1 THEN 'Y' WHEN 0 THEN 'N' END AS hasPassed
FROM tbl2
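The EXCEPT query can be exercised on the question's example rows with the sketch below. The in-memory SQLite database and the column types (TEXT flags in tbl1, INTEGER flags in tbl2, dates as text) are assumptions made to keep it runnable.

```python
import sqlite3

# In-memory SQLite database (an assumption; the question names no engine).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl1 (
        DateRecorded TEXT, SchoolName TEXT, StudentName TEXT,
        isAbsent TEXT, hasPassed TEXT
    );
    CREATE TABLE tbl2 (
        DateRecorded TEXT, SchoolName TEXT, StudentName TEXT,
        isAbsent INTEGER, hasPassed INTEGER
    );
    INSERT INTO tbl1 VALUES
        ('2011-04-03', 'ABC', 'John', 'Y', 'Y'),
        ('2011-04-05', 'ABC', 'John', 'N', 'Y');
    INSERT INTO tbl2 VALUES
        ('2011-04-03', 'ABC', 'John', 0, 1),
        ('2011-04-05', 'ABC', 'John', 0, 1);
    """
)

# Map tbl2's 1/0 flags to 'Y'/'N', then subtract those rows from tbl1.
# Whatever remains is a tbl1 row with no exact match in tbl2.
mismatches = conn.execute(
    """
    SELECT * FROM tbl1
    EXCEPT
    SELECT DateRecorded, SchoolName, StudentName,
           CASE isAbsent WHEN 1 THEN 'Y' WHEN 0 THEN 'N' END,
           CASE hasPassed WHEN 1 THEN 'Y' WHEN 0 THEN 'N' END
    FROM tbl2
    """
).fetchall()

for row in mismatches:
    print(row)
```

The 2011-04-05 row disappears because its mapped counterpart in tbl2 is identical, while the 2011-04-03 row stays because isAbsent is Y in tbl1 but maps to N from tbl2.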
