Select rows that are duplicates on two columns


I have data in a table. There are 3 columns (ID, Interval, ContactInfo). This table lists all phone contacts. I'm attempting to get a count of phone numbers that called twice on the same day and have no idea how to go about this. I can get duplicate entries for the same number but it does not match on date. The code I have so far is below.
SELECT ContactInfo, COUNT(Interval) AS NumCalls
FROM AllCalls
GROUP BY ContactInfo
HAVING COUNT(AllCalls.ContactInfo) > 1
I'd like to have it return the date, the number of calls on that date if more than 1, and the phone number.
Sample data:
|ID |Interval |ContactInfo|
|--------|------------|-----------|
|1 |3/1/2017 |8009999999 |
|2 |3/1/2017 |8009999999 |
|3 |3/2/2017 |8001234567 |
|4 |3/2/2017 |8009999999 |
|5 |3/3/2017 |8007771111 |
|6 |3/3/2017 |8007771111 |
|--------|------------|-----------|
Expected result:
|Interval |ContactInfo|NumCalls|
|------------|-----------|--------|
|3/1/2017 |8009999999 |2 |
|3/3/2017 |8007771111 |2 |
|------------|-----------|--------|

Just as juergen d suggested, you should try to add Interval in your GROUP BY. Like so:
SELECT AC.ContactInfo
, AC.Interval
, COUNT(*) AS qnty
FROM AllCalls AS AC
GROUP BY AC.ContactInfo
, AC.Interval
HAVING COUNT(*) > 1

The code should look like this:
select Interval , ContactInfo, count(ID) AS NumCalls from AllCalls group by Interval, ContactInfo having count(ID)>1;
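To check the answers above against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the variable names are assumptions for illustration; the query itself is the GROUP BY on both columns suggested above.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AllCalls (ID INTEGER, Interval TEXT, ContactInfo TEXT)")
conn.executemany(
    "INSERT INTO AllCalls VALUES (?, ?, ?)",
    [
        (1, "3/1/2017", "8009999999"),
        (2, "3/1/2017", "8009999999"),
        (3, "3/2/2017", "8001234567"),
        (4, "3/2/2017", "8009999999"),
        (5, "3/3/2017", "8007771111"),
        (6, "3/3/2017", "8007771111"),
    ],
)

# Group on BOTH columns so each (date, number) pair is counted separately,
# then keep only the pairs that occur more than once.
repeat_callers = conn.execute(
    """
    SELECT Interval, ContactInfo, COUNT(*) AS NumCalls
    FROM AllCalls
    GROUP BY Interval, ContactInfo
    HAVING COUNT(*) > 1
    ORDER BY Interval
    """
).fetchall()

for row in repeat_callers:
    print(row)
```

Grouping by ContactInfo alone would merge all calls from one number across days, which is why the original query did not match on date.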

Related

If condition TRUE in a row (that is grouped)

Table:
|Months |ID|Commission|
|2020-01|1 |2312 |
|2020-02|2 |24412 |
|2020-02|1 |123 |
|... |..|... |
What I need:
COUNT(Months),
ID,
SUM(Commission),
Country
GROUP BY ID...
How it should look:
|Months |ID|Commission|
|4 |1 |5356 |
|6 |2 |5436 |
|... |..|... |
I want to know for how many months each ID received a commission. However (and this is the part where I need help), if the ID is still receiving commission up to the current month, I want to exclude it from the list. If it stopped receiving commission last month or last year, I want to see it in the table.
In other words, I want a table of old clients (who no longer receive commission).

Use aggregation. Assuming there is one row per month:
select id, count(*)
from t
group by id
having max(months) < date_format(now(), '%Y-%m');
Note this uses MySQL syntax, which was one of the original tags.
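The same HAVING filter can be tried outside MySQL. Below is a rough sketch with Python's sqlite3 module; the table name t follows the answer, the fourth data row and the fixed "current month" value are made up so the result is reproducible. In a real query you would compute the current month in SQL, as the answer does with date_format(now(), '%Y-%m').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (months TEXT, id INTEGER, commission INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2020-01", 1, 2312),
        ("2020-02", 2, 24412),
        ("2020-02", 1, 123),
        ("2020-03", 2, 500),  # hypothetical row: ID 2 is still being paid
    ],
)

# Assumed "current month", passed as a parameter to keep the example stable.
current_month = "2020-03"

# 'YYYY-MM' strings compare correctly as text, so MAX(months) is the last
# month in which the ID was paid. IDs paid in the current month are dropped.
old_clients = conn.execute(
    """
    SELECT id, COUNT(*) AS months_paid, SUM(commission) AS total_commission
    FROM t
    GROUP BY id
    HAVING MAX(months) < ?
    """,
    (current_month,),
).fetchall()

print(old_clients)
```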

Oracle SQL - How to return the name with the highest ID ending in a certain number

I have a table structured like this where I need to get the ID's last number, how many people's ID ends with that number, and the person with the highest ID:
Members: |ID |Name |
-----------------
|123 |foo |
|456 |bar |
|789 |boo |
|1226|far |
The result I need to get looks something like this
|LAST_NUMBER |OCCURENCES |HIGHEST_ID_GUY |
---------------------------------------------
|3 |1 |foo |
|6 |2 |far |
|9 |1 |boo |
However, while I can get the first two results to display correctly, I have no idea how to display HIGHEST_ID_GUY. My code looks like this:
SELECT DISTINCT SUBSTR(id, LENGTH(id - 1), LENGTH(id)) AS LAST_NUMBER,
COUNT(*) AS OCCURENCES
/* This is where I need to add HIGHEST_ID_GUY */
FROM Members
GROUP BY SUBSTR(id, LENGTH(id - 1), LENGTH(id))
ORDER BY LAST_NUMBER
Any help appreciated :)

If id is a number, then use arithmetic operations:
select mod(id, 10) as last_digit,
count(*),
max(name) keep (dense_rank first order by id desc) as name_at_biggest
from t
group by mod(id, 10);
If id is a string, then you need to convert to a number or something similar to define the "highest id". For instance:
select substr(id, -1) as last_digit,
count(*),
max(name) keep (dense_rank first order by to_number(id) desc) as name_at_biggest
from t
group by substr(id, -1);
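The KEEP (DENSE_RANK FIRST ...) syntax is Oracle-specific. For comparison, here is a sketch of a more portable variant that joins the grouped result back to the table on the maximum ID. It runs on SQLite through Python's sqlite3 module and assumes a numeric id column; this is a swapped-in technique, not the Oracle query from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?)",
    [(123, "foo"), (456, "bar"), (789, "boo"), (1226, "far")],
)

# Step 1 (derived table g): last digit, number of members, highest id per digit.
# Step 2: join back to Members to fetch the name that belongs to that id.
result = conn.execute(
    """
    SELECT g.last_digit, g.occurrences, m.name AS highest_id_guy
    FROM (
        SELECT id % 10 AS last_digit,
               COUNT(*) AS occurrences,
               MAX(id)  AS max_id
        FROM Members
        GROUP BY id % 10
    ) AS g
    JOIN Members AS m ON m.id = g.max_id
    ORDER BY g.last_digit
    """
).fetchall()

for row in result:
    print(row)
```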

How do I select oldest date values per group while using DateSerial function in Microsoft Access, using SQL?

I need to select the earliest date out of a group of records with the same userID. However, the date field I'm using was in a string format organized as such: yyyymmdd.
So I used the DateSerial function to convert the dates to this format: mm/dd/yyyy. That was step one. Step two (which is where I need some help) is the grouping of UserIDs by oldest date. Any help would be greatly appreciated.
Current query:
SELECT
[userID],
[company],
DateSerial(Left([DateOfSale], 4), Mid([DateOfSale], 5, 2), Right([DateOfSale], 2)) AS SaleDate
FROM mytable
Result:
|userID| company | SaleDate |
_________________________________
|1 | catworld | 01/01/2005 |
|1 | catworld | 01/03/2017 |
|2 | fishworld| 05/05/2019 |
|3 | dogworld | 02/01/2005 |
|3 | dogworld | 02/03/2017 |
Desired Result:
|userID| company | SaleDate |
_________________________________
|1 | catworld | 01/01/2005 |
|2 | fishworld| 05/05/2019 |
|3 | dogworld | 02/01/2005 |

Consider CDate, which can cast date strings to actual date/time values. However, because yyyymmdd is not a valid date format, add hyphens between the date parts for proper casting. Do so at the table level with a new column. See the DDL commands below, to be run separately or through the Access GUI (table design > new field):
ALTER TABLE main_table ADD COLUMN Saledate_Actual Date;
UPDATE main_table SET SaleDate_Actual = CDate(
LEFT([DateOfSale], 4) & '-' &
MID([DateOfSale], 5, 2) & '-' &
RIGHT([DateOfSale], 2)
);
Then join an aggregate query to the main table.
SELECT m.*
FROM main_table m
INNER JOIN
(SELECT userID, MIN(Saledate_Actual) AS min_date
FROM main_table
GROUP BY userID) AS agg
ON m.userID = agg.userID
AND m.Saledate_Actual = agg.min_date
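Maybe one day MS Access will have window functions. Please upvote my request to the MS Access team (no need to log in to vote)! The query could then use a derived table, since window functions are not allowed directly in a WHERE clause:
SELECT sub.*
FROM (SELECT m.*, MIN(m.Saledate_Actual) OVER (PARTITION BY m.userID) AS min_date
FROM main_table m) AS sub
WHERE sub.Saledate_Actual = sub.min_date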

You can filter with a correlated subquery. The good thing about the current format of your string dates (yyyymmdd) is that it can be properly sorted, so this should just work:
select
[userID],
[company],
DateSerial(Left([DateOfSale], 4), Mid([DateOfSale], 5, 2), Right([DateOfSale], 2)) AS SaleDate
from mytable t
where t.[DateOfSale] = (
select min(t1.[DateOfSale]) from mytable t1 where t1.[userID] = t.[userID]
)
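To illustrate why the yyyymmdd strings can be compared directly, here is a small sketch of the correlated-subquery filter on the raw DateOfSale column. It uses SQLite through Python's sqlite3 module rather than Access, so the DateSerial conversion is left out; the table contents are reconstructed from the result shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (userID INTEGER, company TEXT, DateOfSale TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "catworld", "20050101"),
        (1, "catworld", "20170103"),
        (2, "fishworld", "20190505"),
        (3, "dogworld", "20050201"),
        (3, "dogworld", "20170203"),
    ],
)

# Fixed-width yyyymmdd strings sort in chronological order, so MIN() on the
# text column already yields the earliest sale for each user.
earliest_sales = conn.execute(
    """
    SELECT t.userID, t.company, t.DateOfSale
    FROM mytable AS t
    WHERE t.DateOfSale = (
        SELECT MIN(t1.DateOfSale)
        FROM mytable AS t1
        WHERE t1.userID = t.userID
    )
    ORDER BY t.userID
    """
).fetchall()

for row in earliest_sales:
    print(row)
```

If two sales for the same user share the earliest date, both rows are returned.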

Count and max aggregate function in same table in one query

I have to use the count and max aggregate functions in the same query. For example, I have a history table that contains a date column. I need to retrieve the latest date as well as a count() with some criteria. The criteria apply only to count(). I am able to retrieve the latest date using the max and rank functions, but I could not combine both. Could you please assist?
Update:
Scenario : Customer buys/sells Shares.
Input: Table Share_history and Table Customer and Table Share and Table Share_Status
Customer :
Cust_id |Cust_name
1 |A
2 |B
Share :
Share_id|Share_Name|Owner|
10 |ABC |XYZ |
20 |BCD |MNC |
Share_Status :
Share_Status_Id|Share_Status_Name
1 |Buy
2 |Sell
Share_history :
Share_history _id|Share_id|Trans_date|Share_status_Id|Cust_id
100 |10 |12/12/14 | 1 |1
101 |10 |24/12/14 | 2 |1
102 |10 |14/01/15 | 1 |1
103 |10 |28/02/15 | 2 |1
103 |10 |16/03/15 | 1 |1
Output: latest Trans_date and count(no of times specific share was bought(1)) and Cust_id=1.
Query:
select share1.Share_id,SHAREHIST.Latest_Date,SHAREHIST.buycount
from Share share1 left outer join
(select share_id,max(Trans_date) keep(dense_rank last order by share_id) as Latest_Date,
(select count(*) as buycount from Share_history where Share_status_id=1 and Share_id=share1.Share_id)
from Share_history
group by Share_id
) SHAREHIST
on SHAREHIST.share_id=share1.share_id
EXPECTED :
Share_id|Latest_Date|buycount
10 |16/03/15 | 3

Try using this:
SELECT
Share_id
,Trans_Date
,COUNT(Share_id) buycount
FROM
(
SELECT
*
FROM Share_history SH
WHERE Trans_Date = (SELECT MAX(Trans_Date) FROM Share_history)
) SH
GROUP BY Share_id, Trans_Date
You can add the remaining joins as needed.

I think you just want aggregation:
select sh.share_id, max(trans_date) as trans_date, count(*) as buy_count
from share_history sh
where cust_id = 1
group by sh.share_id;
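Since the question asks for the criteria to apply only to the count, one possible variant is conditional aggregation: MAX() looks at every row while the count only adds up the "Buy" rows. The sketch below uses SQLite via Python's sqlite3 module and, as an assumption, stores the dates in ISO format (yyyy-mm-dd) so that MAX() on text returns the latest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE share_history (
        share_history_id INTEGER,
        share_id INTEGER,
        trans_date TEXT,          -- ISO dates (assumption) so text order = date order
        share_status_id INTEGER,  -- 1 = Buy, 2 = Sell
        cust_id INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO share_history VALUES (?, ?, ?, ?, ?)",
    [
        (100, 10, "2014-12-12", 1, 1),
        (101, 10, "2014-12-24", 2, 1),
        (102, 10, "2015-01-14", 1, 1),
        (103, 10, "2015-02-28", 2, 1),
        (103, 10, "2015-03-16", 1, 1),
    ],
)

# MAX() sees all transactions of the customer; the CASE expression restricts
# the count to rows with share_status_id = 1 (Buy).
summary = conn.execute(
    """
    SELECT sh.share_id,
           MAX(sh.trans_date) AS latest_date,
           SUM(CASE WHEN sh.share_status_id = 1 THEN 1 ELSE 0 END) AS buycount
    FROM share_history AS sh
    WHERE sh.cust_id = 1
    GROUP BY sh.share_id
    """
).fetchall()

print(summary)
```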

SQL query to return a grouped result as a single row

If I have a jobs table like:
|id|created_at |status |
----------------------------
|1 |01-01-2015 |error |
|2 |01-01-2015 |complete |
|3 |01-01-2015 |error |
|4 |01-02-2015 |complete |
|5 |01-02-2015 |complete |
|6 |01-03-2015 |error |
|7 |01-03-2015 |on hold |
|8 |01-03-2015 |complete |
I want a query that will group them by date and count the occurrence of each status and the total status for that date.
SELECT created_at, status, count(status)
FROM jobs
GROUP BY created_at, status;
Which gives me
|created_at |status |count|
-------------------------------
|01-01-2015 |error |2
|01-01-2015 |complete |1
|01-02-2015 |complete |2
|01-03-2015 |error |1
|01-03-2015 |on hold |1
|01-03-2015 |complete |1
I would like to now condense this down to a single row per created_at unique date with some sort of multi column layout for each status. One constraint is that status is any one of 5 possible words but each date might not have one of every status. Also I would like a total of all statuses for each day. So desired results would look like:
|date |total |errors|completed|on_hold|
----------------------------------------------
|01-01-2015 |3 |2 |1 |null
|01-02-2015 |2 |null |2 |null
|01-03-2015 |3 |1 |1 |1
the columns could be built dynamically from something like
SELECT DISTINCT status FROM jobs;
with a null result for any day that doesn't contain any of that type of status. I am no SQL expert but am trying to do this in a DB view so that I don't have to bog down doing multiple queries in Rails.
I am using PostgreSQL but would like to try to keep it straight SQL. I have tried to understand aggregate functions well enough to use some other tools, but without success.

The following should work in any RDBMS:
SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) as errors,
sum(case when status = 'complete' then 1 end) as completed,
sum(case when status = 'on hold' then 1 end) as on_hold
FROM jobs
GROUP BY created_at;
The query uses conditional aggregation so as to pivot grouped data. It assumes that status values are known before-hand. If you have additional cases of status values, just add the corresponding sum(case ... expression.
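As a quick check of the conditional aggregation, here is a minimal sketch that runs the same query against the sample jobs table. It uses SQLite through Python's sqlite3 module as a stand-in database (an assumption; the question targets PostgreSQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER, created_at TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (1, "01-01-2015", "error"),
        (2, "01-01-2015", "complete"),
        (3, "01-01-2015", "error"),
        (4, "01-02-2015", "complete"),
        (5, "01-02-2015", "complete"),
        (6, "01-03-2015", "error"),
        (7, "01-03-2015", "on hold"),
        (8, "01-03-2015", "complete"),
    ],
)

# Each SUM(CASE ...) counts one status. With no ELSE branch the CASE yields
# NULL for other rows, so a date without that status ends up as NULL.
pivot = conn.execute(
    """
    SELECT created_at,
           COUNT(status) AS total,
           SUM(CASE WHEN status = 'error'    THEN 1 END) AS errors,
           SUM(CASE WHEN status = 'complete' THEN 1 END) AS completed,
           SUM(CASE WHEN status = 'on hold'  THEN 1 END) AS on_hold
    FROM jobs
    GROUP BY created_at
    ORDER BY created_at
    """
).fetchall()

for row in pivot:
    print(row)
```

Python shows the SQL NULLs as None, which matches the null cells in the desired result.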

An actual crosstab query would look like this:
SELECT * FROM crosstab(
$$SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY 1, 2
ORDER BY 1, 2$$
,$$SELECT unnest('{error,complete,"on hold"}'::text[])$$)
AS ct (date date, errors int, completed int, on_hold int);
Should perform very well.
Basics:
PostgreSQL Crosstab Query
The above does not yet include the total per date.
Postgres 9.5 introduces the ROLLUP clause, which is perfect for the case:
SELECT * FROM crosstab(
$$SELECT created_at, COALESCE(status, 'total'), ct
FROM (
SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY created_at, ROLLUP(status)
) sub
ORDER BY 1, 2$$
,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
AS ct (date date, total int, errors int, completed int, on_hold int);
Up to Postgres 9.4, use this query instead:
WITH cte AS (
SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY 1, 2
)
TABLE cte
UNION ALL
SELECT created_at, 'total', sum(ct)
FROM cte
GROUP BY 1
ORDER BY 1
Related:
Grouping() equivalent in PostgreSQL?
If you want to stick to a simple query, this is a bit shorter:
SELECT created_at
, count(*) AS total
, count(status = 'error' OR NULL) AS errors
, count(status = 'complete' OR NULL) AS completed
, count(status = 'on hold' OR NULL) AS on_hold
FROM jobs
GROUP BY 1;
count(status) for the total per date is error-prone, because it would not count rows with NULL values in status. Use count(*) instead, which is also shorter and a bit faster.
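The difference is easy to see with a row whose status is NULL. The sketch below uses a made-up three-row table in SQLite (via Python's sqlite3 module); the `OR NULL` trick also evaluates this way in SQLite, but the data here is purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER, created_at TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (1, "01-01-2015", "error"),
        (2, "01-01-2015", None),       # hypothetical row with a missing status
        (3, "01-01-2015", "complete"),
    ],
)

# COUNT(*) counts rows; COUNT(expr) counts only rows where expr is not NULL.
# "status = 'error' OR NULL" is true for error rows and NULL otherwise.
all_rows, with_status, errors = conn.execute(
    """
    SELECT COUNT(*),
           COUNT(status),
           COUNT(status = 'error' OR NULL)
    FROM jobs
    """
).fetchone()

print(all_rows, with_status, errors)
```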
Here is a list of techniques:
For absolute performance, is SUM faster or COUNT?
In Postgres 9.4+ use the new aggregate FILTER clause, like @a_horse mentioned:
SELECT created_at
, count(*) AS total
, count(*) FILTER (WHERE status = 'error') AS errors
, count(*) FILTER (WHERE status = 'complete') AS completed
, count(*) FILTER (WHERE status = 'on hold') AS on_hold
FROM jobs
GROUP BY 1;
Details:
How can I simplify this game statistics query?

We Keep Coding
