I have a SQL query that returns data similar to this pseudo-table:
| Name | Id1 | Id2 | Guid |
|------+-----+-----+------|
| Joe | 1 | 1 | 1123 |
| Joe | 2 | 1 | 1123 |
| Joe | 3 | 1 | 1120 |
| Jeff | 1 | 1 | 1123 |
| Moe | 3 | 42 | 1120 |
I would like to display an additional column on the output, listing the total number of records that have matching GUIDs to a given row, like this:
| Name | Id1 | Id2 | Guid | # Matching |
+------+-----+-----+------+------------+
| Joe | 1 | 1 | 1123 | 3 |
| Joe | 2 | 1 | 1123 | 3 |
| Joe | 3 | 1 | 1120 | 2 |
| Jeff | 1 | 1 | 1123 | 3 |
| Moe | 3 | 42 | 1120 | 2 |
I was able to accomplish this by joining the query with itself, and doing a count. However, the query is rather large and takes awhile to complete, is there any way I can accomplish this without joining the query with itself?
You want a window function:
select t.*, count(*) over (partition by guid) as num_matching
from t;
Related
How can we use the GROUP BY clause in a table with rank function implemented on it in Postgresql?
candidate_id | name | status | phone_num | process_fk
--------------+---------+--------------+-----------+------------
1 | james | regular | 0985905 | 1
2 | kylie | probationary | 098889487 | 2
3 | Anne | regular | 095590 | 1
4 | Olivia | probationary | 09556451 | 3
5 | Charles | regular | 09445099 | 2
6 | Hannah | regular | 0977988 | 2
I'm quite new to SQL - hope you can help:
I have several tables that all have 3 columns in common: ObjNo, Date(year-month), Product.
Each table has 1 other column, that represents an economic value (sales, count, netsales, plan ..)
I need to join all tables on the 3 common columns giving. The outcome must have one row for each existing combination of the 3 common columns. Not every combination exists in every table.
If I do full outer joins, I get ObjNo, Date, etc. for each table, but only need them once.
How can I achieve this?
+--------------+-------+--------+---------+-----------+
| tblCount | | | | |
+--------------+-------+--------+---------+-----------+
| | ObjNo | Date | Product | count |
| | 1 | 201601 | Snacks | 22 |
| | 2 | 201602 | Coffee | 23 |
| | 4 | 201605 | Tea | 30 |
| | | | | |
| tblSalesPlan | | | | |
| | ObjNo | Date | Product | salesplan |
| | 1 | 201601 | Beer | 2000 |
| | 2 | 201602 | Sancks | 2000 |
| | 5 | 201605 | Tea | 2000 |
| | | | | |
| | | | | |
| tblSales | | | | |
| | ObjNo | Date | Product | Sales |
| | 1 | 201601 | Beer | 1000 |
| | 2 | 201602 | Coffee | 2000 |
| | 3 | 201603 | Tea | 3000 |
+--------------+-------+--------+---------+-----------+
Thx
Devon
It sounds like you're using SELECT * FROM... which is giving you every field from every table. You probably only want to get the values from one table, so you should be explicit about which fields you want to include in the results.
If you're not sure which table is going to have a record for each case (i.e. there is not guaranteed to be a record in any particular table) you can use the COALESCE function to get the first non-null value in each case.
SELECT COALESCE(tbl1.ObjNo, tbl2.ObjNo, tbl3.ObjNo) AS ObjNo, ....
tbl1.Sales, tbl2.Count, tbl3.Netsales
SQL beginner here. I've got a simple test that users take, and each row is the answer to one of their questions. They're allowed to take the exam once per day, so some people take it a second time on another day, and thus will have many rows with different test dates. What I'm basically trying to do is get each user's most recent score.
Here is what my data looks like (table name is dumdum):
+----------+----------------+----------+------------------+
| USERNAME | CORRECT_ANSWER | RESPONSE | DATE_TAKEN |
+----------+----------------+----------+------------------+
| matt | 1 | 1 | 3/23/15 1:04:26 |
| matt | 2 | 2 | 3/23/15 1:04:28 |
| matt | 3 | 3 | 3/23/15 1:04:23 |
| david | 1 | 3 | 3/20/15 1:04:25 |
| david | 2 | 2 | 3/20/15 1:04:28 |
| david | 3 | 1 | 3/20/15 1:04:30 |
| david | 1 | 1 | 3/21/15 11:03:14 |
| david | 2 | 3 | 3/21/15 11:03:17 |
| david | 3 | 2 | 3/21/15 11:03:19 |
| chris | 1 | 2 | 3/17/15 12:45:52 |
| chris | 2 | 2 | 3/17/15 12:45:56 |
| chris | 3 | 3 | 3/17/15 12:45:59 |
| peter | 1 | 1 | 3/19/15 2:45:33 |
| peter | 2 | 3 | 3/19/15 2:45:35 |
| peter | 3 | 2 | 3/19/15 2:45:38 |
| peter | 1 | 1 | 3/20/15 12:32:04 |
| peter | 2 | 2 | 3/20/15 12:32:05 |
| peter | 3 | 3 | 3/20/15 12:32:05 |
+----------+----------------+----------+------------------+
and what I'm trying to get in the end...
+----------+------------------+-------+
| USERNAME | MOST_RECENT_TEST | SCORE |
+----------+------------------+-------+
| matt | 3/23/2015 | 100 |
| david | 3/21/2015 | 33 |
| chris | 3/17/2015 | 67 |
| peter | 3/20/2015 | 100 |
+----------+------------------+-------+
I ran into some trouble because I need to go by day, and not by day/time, so I had to do a weird maneuver where I went to character and back to date... This is what I have so far, but I can't figure out how to use only the scores from the most recent test (right now it's factoring in all scores from every test ever taken)...
SELECT username, to_date(substr(max(test_date),1,9),'dd-MON-yy') as most_recent_test, round((sum(case when response=correct_answer then 1 end)/3)*100,0) as score
FROM dumdum group by username
Any help would be appreciated! Thanks!
There are several solutions to this problem this one uses the WITH clause and the RANK function.
It also uses the TRUNC function rather than to_date(substr(
with mxDate as
(SELECT USERNAME,
TRUNC(DATE_TAKEN) as MOST_RECENT_TEST,
CASE WHEN CORRECT_ANSWER = RESPONSE THEN 1 else 0 END as SCORE,
RANK () OVER (PARTITION BY USERNAME
ORDER BY TRUNC(DATE_TAKEN) DESC) Rk
FROM dumdum)
SELECT
USERNAME,
MOST_RECENT_TEST,
SUM(SCORE)/3 * 100
FROM
mxDate
WHERE
rk = 1
GROUP BY
USERNAME,
MOST_RECENT_TEST
Demo
i have table like this:
| ID | id_number | a | b |
| 1 | 1 | 0 | 215 |
| 2 | 2 | 28 | 8952 |
| 3 | 3 | 10 | 2000 |
| 4 | 1 | 0 | 215 |
| 5 | 1 | 0 |10000 |
| 6 | 3 | 10 | 5000 |
| 7 | 2 | 3 |90933 |
I want to sum a*b where id_number is same, what the query to get all value for every id_number? for example the result is like this :
| ID | id_number | result |
| 1 | 1 | 0 |
| 2 | 2 | 523455 |
| 3 | 3 | 70000 |
This is a simple aggregation query:
select id_number, sum(a*b)
from t
group by id_number
I'm not sure what the first column is for.
Ok, so i'm trying to write a complex query (at least complex to me) and need some pro help. This is my database setup:
Table: MakeList
| MakeListId | Make |
| 1 | Acura |
| 2 | Chevy |
| 3 | Pontiac |
| 4 | Scion |
| 5 | Toyota |
Table: CustomerMake
| CustomerMakeId | CustomerId | _Descriptor |
| 1 | 123 | Acura |
| 2 | 124 | Chevy |
| 3 | 125 | Pontiac |
| 4 | 126 | Scion |
| 5 | 127 | Toyota |
| 6 | 128 | Acura |
| 7 | 129 | Chevy |
| 8 | 130 | Pontiac |
| 9 | 131 | Scion |
| 10 | 132 | Toyota |
Table: Customer
| CustomerId | StatusId |
| 123 | 1 |
| 124 | 1 |
| 125 | 1 |
| 126 | 2 |
| 127 | 1 |
| 128 | 1 |
| 129 | 2 |
| 130 | 1 |
| 131 | 1 |
| 132 | 1 |
What i am trying to end up with is this...
Desired Result Set:
| Make | CustomerId|
| Acura | 123 |
| Chevy | 124 |
| Pontiac | 125 |
| Scion | 131 |
| Toyota | 127 |
I am wanting a list of unique Makes with one active (StatusId = 1) CustomerId to go with it. I'm assuming i'll have to do some GROUP BYs and JOINS but i haven't been able to figure it out. Any help would be greatly appreciated. Let me know if i haven't given enough info for my question. Thanks!
UPDATE: The script doesn't have to be performant - it will be used one time for testing purposes.
Something like this:
select cm._Descriptor,
min(cu.customerid)
from CustomerMake cm
join Customer cu on cuo.CustomerId = cm.CustomerId and cu.StatusId = 1
group by cm._Descriptor
I left out the MakeList table as it seems unnecessary because you are storing the full make name as _Descriptorin the CustomerMake table anyway (so the question is what is the MakeList table for? Why don't you store a FK to it in the CustomerMake table?)
You want to
(a) join the customer and customermake tables
(b) filter on customer.statusid
(c) group by customermake._descriptor
Depending on your RDBMS, you may need to explicitly apply a group function to customer.customerid to include it in the select list. Since you don't care which particular customerid is displayed, you could use MIN or MAX to just pick an essentially arbitrary value.