Counting Combinations of rows/columns across one table


I have a query I am having trouble wrapping my head around. What I'm trying to do is come up with a report that will have rows of the major (Accounting, Business, etc.) with columns of the type of enrollment (enrolled, withdrawn, etc.) with counts for each. Right now, here is my query:
SELECT datatel_academicprogramofinterestidname, datatel_prospectstatusname
FROM FilteredContact
GROUP BY datatel_academicprogramofinterestidname, datatel_prospectstatusname
This gives me every combination of these two fields found in my table. I want to get counts for each of the combinations to display. The rows would be the interestidname field, and the columns would be the prospectstatusname field. In every cell there would be a count of how many rows have that specific combination (Accounting/Enrolled, for example).
I've tried count in multiple ways but can't seem to get it to break the columns out the way I want. Not sure how to use group by, count, and my where clause all in conjunction to have it formatted how I want. The good thing is all my information is in one table, I just can't make it look how I want.
|Accounting (BS) | Accepted | 25|
|Acting (BFA) | Accepted | 32|
|Advertising & Marketing Communications (BA) | Accepted | 29|
|American Studies (BA) | Accepted | 2|
|Accounting (BS) | Enrolled | 5|
|Acting (BFA) | Enrolled | 17|
|Advertising & Marketing Communications (BA) | Enrolled | 40|
|American Studies (BA) | Enrolled | 10|

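You may be able to use a pivot here, assuming SQL Server and a primary key column named p_id. Select only the columns you need in a derived table first; otherwise PIVOT groups by every remaining column of FilteredContact.
SELECT *
FROM (SELECT p_id, datatel_academicprogramofinterestidname, datatel_prospectstatusname
      FROM FilteredContact) AS src
PIVOT ( count(p_id)
for datatel_prospectstatusname IN ([Accepted], [Enrolled])
) as p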

You can easily turn the rows of data into columns using an aggregate function with a CASE expression:
select datatel_academicprogramofinterestidname,
  sum(case when datatel_prospectstatusname = 'Enrolled' then 1 else 0 end) Enrolled,
  sum(case when datatel_prospectstatusname = 'Accepted' then 1 else 0 end) Accepted
from FilteredContact
group by datatel_academicprogramofinterestidname
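To try the conditional-aggregation idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table and its handful of rows are made up for illustration; only the two column names come from the question.

```python
import sqlite3

# Hypothetical in-memory stand-in for FilteredContact (only the two columns used here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FilteredContact ("
    " datatel_academicprogramofinterestidname TEXT,"
    " datatel_prospectstatusname TEXT)"
)
rows = (
    [("Accounting (BS)", "Accepted")] * 3
    + [("Accounting (BS)", "Enrolled")] * 1
    + [("Acting (BFA)", "Accepted")] * 2
    + [("Acting (BFA)", "Enrolled")] * 2
)
conn.executemany("INSERT INTO FilteredContact VALUES (?, ?)", rows)

# One output row per program, one SUM(CASE ...) column per status value.
pivot_sql = """
SELECT datatel_academicprogramofinterestidname AS program,
       SUM(CASE WHEN datatel_prospectstatusname = 'Accepted' THEN 1 ELSE 0 END) AS Accepted,
       SUM(CASE WHEN datatel_prospectstatusname = 'Enrolled' THEN 1 ELSE 0 END) AS Enrolled
FROM FilteredContact
GROUP BY datatel_academicprogramofinterestidname
ORDER BY program
"""
result = conn.execute(pivot_sql).fetchall()
for row in result:
    print(row)
```

Each extra status (Withdrawn, and so on) needs its own SUM(CASE ...) column, so the list of statuses has to be known when the query is written.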

Related

SQL Help - Join small lookup table where not all columns are required (and an other option)

I have one large table with transactions and a smaller lookup table with values I want to add based on 4 common columns. The trick here is not every combination of these 4 columns will exist in the lookup table and there are scenarios where I want it to stop checking and accept the match instead of going to the next column. I also have an "Other" option to default to if it doesn't match any of the options.
Table structures are something like this:
transaction_table
country, trans_id, store_type, store_name, channel, browser, purchase_amount, currency
lookup_table
country, store_name, channel, browser, trans_fee
The data could be something like this:
transaction_table:
country| trans_id| store_type |store_name |channel |browser |amt |currency
US | 001 | Big Box | Target | B&M |N/A |1.45 |USD
US | 002 | Big Box | Target | Online |Chrome |1.79 |USD
US | 003 | Small | Bob's Store| B&M |N/A |2.50 |USD
US | 004 | Big Box | Walmart | B&M |N/A |1.12 |USD
US | 005 | Big Box | Walmart | Online |Firefox |3.79 |USD
US | 006 | Big Box | Amazon | Online |IE |4.54 |USD
US | 007 | Small | Jim's Plc | B&M |IE |2.49 |USD
lookup_table:
country|store_name |channel |browser |trans_fee
US |Target |B&M |N/A |0.25
US |Target |Online | |0.15
US |Walmart | | |0.30
US |Other | | |0.45
So looking at the lookup_table data:
Row 1 is very specific and would be a match on all 4 of the join columns.
Row 2 would not care what browser was used to shop at Target, so regardless of the "browser" value, the trans_fee should come back the same (other stores may care though).
Row 3 is saying any transaction with country='US' and store_name='Walmart', regardless of the rest of the join columns, would have the same trans_fee.
Row 4 is the "other" scenario where it should look first at the store_name column and if it doesn't find a match, go to Other.
The lookup_table data can change and may end up being time dependent (start_date and end_date columns added) so it really wouldn't be a good candidate for a long, complex CASE statement.
I was thinking of a combination of checking each column with an IF IN statement but I'm hoping there's a more straightforward conditional join type statement I can use to go column by column and have an other option.
Thanks!
edit: I didn't specify this but I want to basically return all of the data from transaction_table and add the corresponding trans_fee to each line.

You will need to use a conditional JOIN.
Something like this
SELECT *
FROM lookup_table
LEFT OUTER JOIN transaction_table
ON CASE WHEN lookup_table.store_name IS NOT NULL
THEN transaction_table.store_name = lookup_table.store_name END

Such partial matching is tricky. And your problem is not really that well set up. You seem to have NULLs in some columns and general values in others.
In any case, you can solve this by matching what you can and then using order by to get the best match. In your case, I think this looks like this:
select tt.*,
       (select l.trans_fee
        from lookup_table l
        where l.country = tt.country and
              l.store_name in ('Other', tt.store_name) and
              (l.channel = tt.channel or l.channel is null) and
              (l.browser = tt.browser or l.browser is null)
        order by (case when l.store_name = tt.store_name then 1 else 2 end),
                 (case when l.channel = tt.channel then 1 else 2 end),
                 (case when l.browser = tt.browser then 1 else 2 end)
        fetch first 1 row only
       ) as trans_fee
from transaction_table tt;
This is generic SQL. But the same idea should work in any database.

New column referencing second table - do I need a join?

I have two tables (first two shown) and need to make a third from the first two - do I need to do a join or can you reference a table without joining?
The third table shown is the desired output. Thanks for any help!
+-----+-----------+
| ACC | CALL DATE |
+-----+-----------+
| 1 1 | 2/1/18    |
+-----+-----------+
+-----+---------------+
| ACC | PURCHASE DATE |
+-----+---------------+
| 1 1 | 1/1/18        |
+-----+---------------+
+-----+-----------+----------------------+
| ACC | CALL DATE | PRIOR MONTH PURCHASE |
+-----+-----------+----------------------+
| 1 1 | 2/1/18    | YES                  |
+-----+-----------+----------------------+

Of course you can have a query that references multiple tables without joining. union all is an example of an operator that does that.
There is also the question of what you mean by "joining" in the question. If you mean explicit joins, there are ways around that -- such as correlated subqueries. However, these are implementing some form of "join" in the database engine.
As for your query, you would want to use exists with a correlated subquery:
select t1.*,
(case when exists (select 1
from table2 t2
where t2.acc = t1.acc and
datediff(month, t2.purchase_date, t1.call_date) = 1
)
then 'Yes' else 'No'
end) as prior_month_purchase
from table1 t1;
This is "better" than a join because it does not multiply or remove rows. The result set has exactly the rows in the first table, with the additional column.
The syntax assumes SQL Server (which was an original tag). Similar logic can be expressed in other databases, although date functions are notoriously database-dependent.
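As a rough illustration of the EXISTS approach outside SQL Server, the sketch below uses Python's sqlite3 module. SQLite has no datediff, so the month difference is computed from strftime year and month parts; that substitution, the ISO date strings, and the account numbers 11 and 12 are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (acc INTEGER, call_date TEXT);
CREATE TABLE table2 (acc INTEGER, purchase_date TEXT);
-- Hypothetical data: account 11 has two purchases in the prior month,
-- account 12 has only an older purchase.
INSERT INTO table1 VALUES (11, '2018-02-01'), (12, '2018-02-01');
INSERT INTO table2 VALUES (11, '2018-01-01'), (11, '2018-01-15'), (12, '2017-11-20');
""")

# Month difference = (year * 12 + month) of the call minus the same for the purchase.
exists_sql = """
SELECT t1.acc, t1.call_date,
       CASE WHEN EXISTS (
                SELECT 1
                FROM table2 t2
                WHERE t2.acc = t1.acc
                  AND (CAST(strftime('%Y', t1.call_date) AS INTEGER) * 12
                       + CAST(strftime('%m', t1.call_date) AS INTEGER))
                    - (CAST(strftime('%Y', t2.purchase_date) AS INTEGER) * 12
                       + CAST(strftime('%m', t2.purchase_date) AS INTEGER)) = 1
            )
            THEN 'Yes' ELSE 'No'
       END AS prior_month_purchase
FROM table1 t1
ORDER BY t1.acc
"""
flagged = conn.execute(exists_sql).fetchall()
for row in flagged:
    print(row)
```

Account 11 appears once even though two purchases qualify, which is the "does not multiply rows" property described above; a plain join would have returned it twice.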

Let's check the options.
If you were to create a new third table based on the data in the first two, then every update, insert, or delete on either table would also have to be propagated to the third table.
If you instead have a view that does what you need, there is no need to maintain a third table, and you get the current data from the first two each time you query it.
create view third_table as
select a.acc, a.call_date,
       case when dateadd(mm,-1,a.call_date)=b.purchase_date then 'Yes' else 'No' end as prior_month_purchase
from first_table a
left join second_table b
on a.acc=b.acc

Create VIEW (count duplicate values in column)

I have a small project with a SQL database that has a table with a few columns.
Question: How do I create a view in SQL Server that counts how many duplicate values I have in a column and shows that number in the next column?
Below you can see the result I want to get.
|id|name|count|
|1 |tom | |
|2 |tom | |
|3 |tom | |
| | | 3 |
|4 |leo | |
| | | 1 |

A view is simply a SELECT statement with the words CREATE VIEW <name> AS in front of it. This allows, for example, one person (a DBA) to maintain (create/alter) complex views, while another person (a developer) only has the rights to select from them.
So to use @Stidgeon's answer (below):
CREATE VIEW MyCounts
AS
SELECT name, COUNT(id) AS counts
FROM table
GROUP BY name
and later you can query
Select * from MyCounts where counts > 1 order by name
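or whatever you need to do. Note that ORDER BY is not allowed in a view definition in SQL Server unless TOP, OFFSET, or FOR XML is also specified, so do the sorting in the query that selects from the view.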
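The view-then-filter pattern above can be tried out with a short sketch in Python's sqlite3 module. SQLite stands in for SQL Server here, and the table name people is a hypothetical replacement for the placeholder name used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO people VALUES (1, 'tom'), (2, 'tom'), (3, 'tom'), (4, 'leo');

-- The view stores only the query text; counts are computed when it is selected from.
CREATE VIEW MyCounts AS
SELECT name, COUNT(id) AS counts
FROM people
GROUP BY name;
""")

# Sorting and filtering happen in the query against the view, not inside it.
all_counts = conn.execute(
    "SELECT name, counts FROM MyCounts ORDER BY name"
).fetchall()
duplicates = conn.execute(
    "SELECT name, counts FROM MyCounts WHERE counts > 1 ORDER BY name"
).fetchall()
print(all_counts)
print(duplicates)
```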

You can do what you want with grouping sets:
select id, name, count(*)
from t
group by grouping sets ((id, name), (name));
The group by on id, name is redundant; the value should always be "1". However, this allows the use of grouping sets, which is a convenient way to phrase the query.
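To see what rows the grouping sets query produces, here is a sketch in Python's sqlite3 module. SQLite does not support GROUPING SETS, so the example swaps in an equivalent UNION ALL of the two groupings; that substitution and the ORDER BY used to interleave the subtotal rows are assumptions for illustration, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO t VALUES (1, 'tom'), (2, 'tom'), (3, 'tom'), (4, 'leo');
""")

# UNION ALL of the (id, name) grouping and the (name) grouping,
# which is what GROUPING SETS ((id, name), (name)) expands to.
emulated_sql = """
SELECT id, name, cnt
FROM (SELECT id, name, COUNT(*) AS cnt FROM t GROUP BY id, name
      UNION ALL
      SELECT NULL, name, COUNT(*) FROM t GROUP BY name)
ORDER BY name DESC, id IS NULL, id
"""
grouped = conn.execute(emulated_sql).fetchall()
for row in grouped:
    print(row)
```

The rows with a NULL id are the per-name subtotals, which matches the layout requested in the question (detail rows followed by a count row).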

Looks like you just want to count how many entries you have for each 'name', in which case you just need to do a simple COUNT query:
CREATE VIEW view_name AS
SELECT name, COUNT(id) AS counts
FROM table
GROUP BY name
The output in your case would be:
name counts
--------------
Tom 3
Leo 1

SQL join two tables using value from one as column name for other

I'm a bit stumped on a query I need to write for work. I have the following two tables:
|===============Patterns==============|
|type | bucket_id | description |
|-----------------------|-------------|
|pattern a | 1 | Email |
|pattern b | 2 | Phone |
|==========Results============|
|id | buc_1 | buc_2 |
|-----------------------------|
|123 | pass | |
|124 | pass |fail |
In the results table, I can see that entity 124 failed a validation check in buc_2. Looking at the patterns table, I can see bucket 2 belongs to pattern b (bucket_id corresponds to the column name in the results table), so entity 124 failed phone validation. But how do I write a query that joins these two tables on the value of one of the columns? Limitations to how this query is going to be called will most likely prevent me from using any cursors.

Some crude solutions:
SELECT "id", "description" FROM
Results JOIN Patterns
ON "buc_1" = 'fail' AND "bucket_id" = 1
union all
SELECT "id", "description" FROM
Results JOIN Patterns
ON "buc_2" = 'fail' AND "bucket_id" = 2
Or, with what is very probably a better execution plan:
SELECT "id", "description" FROM
Results JOIN Patterns
ON "buc_1" = 'fail' AND "bucket_id" = 1
OR "buc_2" = 'fail' AND "bucket_id" = 2;
This will report all failure descriptions for each id having a fail case in bucket 1 or 2.
See http://sqlfiddle.com/#!4/a3eae/8 for a live example
That being said, the right solution would probably be to change your schema to something more manageable, say by using an association table to store each failed test, since what you have here is in fact a many-to-many relationship.
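The single-join variant with OR can be exercised with a small sketch in Python's sqlite3 module. SQLite replaces Oracle here, the conditions are parenthesized for readability, and entity 125 is a made-up row added to show an id failing both buckets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patterns (type TEXT, bucket_id INTEGER, description TEXT);
CREATE TABLE Results (id INTEGER, buc_1 TEXT, buc_2 TEXT);
INSERT INTO Patterns VALUES ('pattern a', 1, 'Email'), ('pattern b', 2, 'Phone');
-- 123 and 124 come from the question; 125 is hypothetical.
INSERT INTO Results VALUES (123, 'pass', NULL), (124, 'pass', 'fail'), (125, 'fail', 'fail');
""")

# Each OR branch ties one hard-coded result column to its bucket_id.
failures_sql = """
SELECT r.id, p.description
FROM Results r
JOIN Patterns p
  ON (r.buc_1 = 'fail' AND p.bucket_id = 1)
  OR (r.buc_2 = 'fail' AND p.bucket_id = 2)
ORDER BY r.id, p.bucket_id
"""
failures = conn.execute(failures_sql).fetchall()
for row in failures:
    print(row)
```

Every new buc_N column needs another OR branch, which is the maintenance cost that motivates the schema change suggested above.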

Another approach, if you are using Oracle 11g or later, would be to use the UNPIVOT operation. This translates columns to rows at query execution:
select * from Results
unpivot ("result" for "bucket_id" in ("buc_1" as 1, "buc_2" as 2))
join Patterns
using("bucket_id")
where "result" = 'fail';
Unfortunately, you still have to hard-code the various column names.
See http://sqlfiddle.com/#!4/a3eae/17

It looks to me that what you really want to know is the description (in your example, Phone) of a Pattern entry given the condition that the bucket failed. You want a solution that fulfills that condition in general, not just for your particular example.
I agree with the comment above. Your bucket entries should be tuples (rows) and not columns, and the two tables should share an id so you can actually join them. For example, consider adding a bucket_id column that holds the bucket number, then add just ONE result column to store the state. Like this:
|===============Patterns==============|
|type | bucket_id | description |
|-----------------------|-------------|
|pattern a | 1 | Email |
|pattern b | 2 | Phone |
|==========Results====================|
|entity_id | bucket_id |status |
|-------------------------------------|
|123 | 1 |pass |
|124 | 1 |pass |
|123 | 2 | |
|124 | 2 |fail |
1. Use an inner join (http://www.w3schools.com/sql/sql_join_inner.asp) and a WHERE clause to filter only those buckets that failed.
2. Would this example help?
SELECT Patterns.type, Patterns.description, Results.entity_id, Results.status
FROM Patterns
INNER JOIN Results
ON
Patterns.bucket_id=Results.bucket_id
WHERE
Results.status='fail'
Lastly, I would also add a primary_key column to each table to make sure indexing is faster for each unique combination.
Thanks!
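For comparison, the reshaped schema proposed in this answer can be sketched with Python's sqlite3 module. The key constraints are assumptions added for the example; the table contents follow the tables shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patterns (
    type TEXT,
    bucket_id INTEGER PRIMARY KEY,
    description TEXT
);
-- One row per (entity, bucket) instead of one column per bucket.
CREATE TABLE Results (
    entity_id INTEGER,
    bucket_id INTEGER REFERENCES Patterns (bucket_id),
    status TEXT,
    PRIMARY KEY (entity_id, bucket_id)
);
INSERT INTO Patterns VALUES ('pattern a', 1, 'Email'), ('pattern b', 2, 'Phone');
INSERT INTO Results VALUES (123, 1, 'pass'), (124, 1, 'pass'), (123, 2, NULL), (124, 2, 'fail');
""")

# A plain equi-join on bucket_id; no column names are hard-coded per bucket.
failed_sql = """
SELECT r.entity_id, p.type, p.description, r.status
FROM Patterns p
INNER JOIN Results r ON p.bucket_id = r.bucket_id
WHERE r.status = 'fail'
ORDER BY r.entity_id, p.bucket_id
"""
failed = conn.execute(failed_sql).fetchall()
for row in failed:
    print(row)
```

Adding a third bucket in this layout means inserting rows, not altering the table or the query.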

MS Access sum of 2 table in one query

I have 2 tables, named "mfr" and "pomfr".
Both have many columns, some of which are the same. I want to sum those shared columns in one query, grouped by one of the shared columns.
The data sample is:
table1. mfr
rfno|ppic|pcrt
101 | 10| .30
102 | 15| .50
103 | 18| .68
table2 pomfr
rfno|ppic|pcrt
101 |100 | 1.15
102 | 50 | 1.50
103 | 0 | 0
and the result of the query should be
mfrquery
rfno|ppic|pcrt
101|110 |1.45
102| 65 |2.00
103| 18 | .68

I'll be somewhat nice. This probably isn't the most efficient method, but it'll work...
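select rfno, ppic, pcrt into #temp from mfr
union all
select rfno, ppic, pcrt from pomfr
select rfno, sum(ppic) as ppic, sum(pcrt) as pcrt from #temp group by rfno
What this says is: select the shared columns from mfr, append the rows from pomfr with UNION ALL (a plain UNION would drop duplicate rows and undercount), and place the result in a temporary table called #temp. Filter this to the columns and ranges you need.
Then the second part says: take the sum of ppic and the sum of pcrt from the #temp table and group it by rfno.
Note that SELECT ... INTO #temp is SQL Server syntax; in Access the two SELECTs can go in a saved union query or a subquery instead.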
Since you're new to SO, for future reference, SO people aren't mean, they just want to see you put forth some sort of effort into the problem, I've gotten help SEVERAL times here. Very helpful community! Best of luck to you!
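The union-then-sum idea can be checked against the sample data with a sketch in Python's sqlite3 module. This version uses a derived table instead of a temporary table and SQLite instead of Access, both of which are substitutions made for the example; ROUND is added only to keep floating-point noise out of the printed totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mfr (rfno INTEGER, ppic INTEGER, pcrt REAL);
CREATE TABLE pomfr (rfno INTEGER, ppic INTEGER, pcrt REAL);
INSERT INTO mfr VALUES (101, 10, 0.30), (102, 15, 0.50), (103, 18, 0.68);
INSERT INTO pomfr VALUES (101, 100, 1.15), (102, 50, 1.50), (103, 0, 0);
""")

# Stack both tables with UNION ALL (keeps duplicates), then sum per rfno.
sum_sql = """
SELECT rfno, SUM(ppic) AS ppic, ROUND(SUM(pcrt), 2) AS pcrt
FROM (SELECT rfno, ppic, pcrt FROM mfr
      UNION ALL
      SELECT rfno, ppic, pcrt FROM pomfr) AS combined
GROUP BY rfno
ORDER BY rfno
"""
totals = conn.execute(sum_sql).fetchall()
for row in totals:
    print(row)
```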
